WIP - Added spec tendrl_performance_enhacements.adoc
tendrl-bug-id: #172
Signed-off-by: Shubhendu <shtripat@redhat.com>
Shubhendu committed Jul 19, 2017
1 parent 19ca209 commit bfbc96c
Showing 1 changed file with 172 additions and 0 deletions.
172 changes: 172 additions & 0 deletions specs/tendrl_performance_enhacements.adoc
@@ -0,0 +1,172 @@
= Tendrl performance enhancements for lower CPU and memory consumption

The intent of this change is to keep the load that tendrl components place on
storage nodes minimal. It also covers performant REST APIs, avoiding crashes in
etcd, and predictable job processing with defined CPU and memory usage.

It also aims to define the hardware requirements for a standard tendrl server
and the load incurred on the storage nodes by tendrl components.


== Problem description

This specification describes the changes required in tendrl components to make
them more performant and to ensure they consume fewer resources (CPU, memory)
on storage nodes. It also covers guidelines for storage admins on the hardware
required for the tendrl server, etcd clustering, and the load incurred on
storage nodes.


== Use Cases

* Change the way tendrl entities are written to and read from etcd. Currently
objects are written field by field, which is CPU intensive and needs more
resources.

* The job processor in tendrl constantly polls the `/queue` etcd directory to
find jobs to process. We need a tagged job queue mechanism that reduces the
heavy fetching and probing of `/queue` jobs. With tagged job queues, each
service would watch only the job queues relevant to it and process jobs from
those queues only.

* Provide guidelines on standard hardware requirements for the tendrl server node

* Provide guidelines on setting up a clustered etcd for tendrl

* Tuning of REST endpoints for better performance and predictable response time

* Tuning of different components of tendrl for better memory utilization


== Proposed change

* Annotate flows in tendrl definition files with tagged queue names (the queues
to which these flows would write their jobs)

* Introduce a tagged job queue mechanism in the `tendrl-commons` module.
Services with defined tags would pick jobs for processing from their own tagged
job queues

* Enhance the REST layer to create jobs in tagged job queues, based on the job
queue name in the flow annotation

* Enhance writing to and reading from etcd to treat the whole object as a
single JSON document. When writing, get the JSON representation of the object
and write it as a single value in etcd. When reading, read that single value
and reconstruct the whole object from the JSON.

* Fine tune REST endpoints for better and faster response times

* Document the hardware requirements for tendrl server under wiki

* Document the clustering mechanism of etcd in wiki

* Document the details of the load incurred on storage nodes by tendrl
components and the limits within which it is expected to stay (so that storage
admins can plan resource requirements accordingly)
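
The tagged job queue idea above can be sketched as follows. This is only an
illustration under stated assumptions, not the `tendrl-commons` implementation:
a nested dict stands in for etcd, the `/queue/<tag>/<job_id>` layout is assumed,
and the tag names, flow names and helper functions are hypothetical.

```python
import json

# Assumed layout: one etcd directory per tag, i.e. /queue/<tag>/<job_id>.
# A nested dict {tag: {job_id: json_payload}} stands in for etcd here.


def submit_job(store, tag, job_id, payload):
    """Write a job into the queue that belongs to `tag` (hypothetical helper)."""
    store.setdefault(tag, {})[job_id] = json.dumps(payload)


def jobs_for_tags(store, tags):
    """Return (job_id, payload) pairs from the queues matching `tags` only.

    Queues of other tags are never read, which is the point of tagging.
    """
    found = []
    for tag in tags:
        queue = store.get(tag, {})
        for job_id in sorted(queue):
            found.append((job_id, json.loads(queue[job_id])))
    return found


store = {}
# Tag and flow names below are made up for the example.
submit_job(store, "node-agent", "job-1", {"run": "ImportCluster"})
submit_job(store, "gluster-integration", "job-2", {"run": "CreateVolume"})

# A service tagged "node-agent" sees only its own queue.
node_agent_jobs = jobs_for_tags(store, ["node-agent"])
```

With this layout a service reads one directory per tag it owns instead of
scanning every job under `/queue`.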

=== Alternatives

None

=== Data model impact

* Annotate the tendrl flows in the definition files of the tendrl modules to
define the tagged queue name to which their jobs would be written
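
As a rough illustration of such an annotation, the sketch below models a
definitions file as a Python dict and resolves the queue for a flow. The
`job_queue` key, the flow names and the fallback to a `default` queue are all
assumptions made for the example; the spec does not fix these names.

```python
# Hypothetical queue used when a flow carries no annotation.
DEFAULT_QUEUE = "default"

# Stand-in for a parsed definitions file; key names are assumptions.
definitions = {
    "flows": {
        "CreateVolume": {"job_queue": "gluster-integration"},
        "ImportCluster": {},  # not annotated yet
    }
}


def queue_for_flow(definitions, flow_name):
    """Return the tagged queue name a flow's jobs should be written to."""
    flow = definitions["flows"][flow_name]
    return flow.get("job_queue", DEFAULT_QUEUE)
```

The REST layer could call such a lookup when creating a job, so that the queue
name lives in the definitions rather than in code.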

=== Impacted Modules:

==== Tendrl API impact:

* REST layer writes jobs to tagged queues based on the definitions

* Enhancements for tuning the response time for various GET endpoints

==== Notifications/Monitoring impact:
None

==== Tendrl/common impact:

* Enhancements for processing tagged job queues. Each service should look only
at its defined tagged job queues to find the jobs to pick and process

* Enhance the logic for writing to and reading from etcd to treat the whole
object as a single JSON document
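
The difference between field-by-field writes and a single JSON write can be
sketched as below. This is a minimal example, assuming a plain dict in place of
etcd; the `Volume` class, the key paths and the helper names are hypothetical
and not taken from the tendrl code base.

```python
import json


class Volume(object):
    """Hypothetical tendrl entity used only for this example."""

    def __init__(self, name, status, brick_count):
        self.name = name
        self.status = status
        self.brick_count = brick_count


def write_per_field(store, key, obj):
    """Field-by-field approach: one write per attribute. Returns write count."""
    writes = 0
    for field, value in vars(obj).items():
        store["{}/{}".format(key, field)] = str(value)
        writes += 1
    return writes


def write_as_json(store, key, obj):
    """Proposed approach: serialize the whole object into a single value."""
    store[key] = json.dumps(vars(obj), sort_keys=True)
    return 1


def read_from_json(store, key, cls):
    """Read the single value and rebuild the object from its JSON."""
    return cls(**json.loads(store[key]))


vol = Volume("vol1", "Started", 3)
per_field_store, json_store = {}, {}
per_field_writes = write_per_field(per_field_store, "/volumes/vol1", vol)
json_writes = write_as_json(json_store, "/volumes/vol1", vol)
restored = read_from_json(json_store, "/volumes/vol1", Volume)
```

The per-field variant issues one write per attribute, while the JSON variant
issues one write per object regardless of how many fields it has.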

==== Tendrl/node_agent impact:

* Definition changes for tagging flows with specific job queue names

==== Sds integration impact:

* Definition changes for tagging flows with specific job queue names

==== Tendrl Dashboard impact:

None

=== Security impact:

None.

=== Other end user impact:

None

=== Performance impact:

None.

=== Other deployer impact:

None.

=== Developer impact:

None.


== Implementation:

<Add specific issues here>

=== Assignee(s):

Primary assignee:
shtripat
r0h4n
anivargi

=== Work Items:

* https://github.com/Tendrl/specifications/issues/172


== Dependencies:

None


== Testing:

* Verify that the load incurred on storage nodes by tendrl components is within
the defined limits

* Verify that the response time of the REST endpoints is within the defined
time limits

* Verify the guidelines published regarding clustering of etcd
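
A response-time check of the kind listed above could look like the sketch
below. It times an arbitrary callable rather than a real tendrl endpoint; the
helper name and the limits used are assumptions for illustration, since the
spec does not yet define the actual time limits.

```python
import time


def within_time_limit(call, limit_seconds):
    """Run `call` once and report whether it finished within the limit.

    Returns a (passed, elapsed_seconds) tuple. `call` would wrap the
    request to the endpoint under test.
    """
    start = time.perf_counter()
    call()
    elapsed = time.perf_counter() - start
    return elapsed <= limit_seconds, elapsed


# Stand-ins for a fast and a slow endpoint call.
fast_ok, _ = within_time_limit(lambda: None, 1.0)
slow_ok, slow_elapsed = within_time_limit(lambda: time.sleep(0.05), 0.01)
```

A single measurement is noisy; in practice such a check would likely be
repeated and compared against a percentile rather than one sample.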


== Documentation impact:

* Document for clustered setup of etcd

* Document for hardware requirements for tendrl server

* Document for load details on storage nodes due to tendrl components

== References:

None
