Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up model parsing #208

Open
MichaelClerx opened this issue Feb 3, 2020 · 7 comments
Open

Speed up model parsing #208

MichaelClerx opened this issue Feb 3, 2020 · 7 comments

Comments

@MichaelClerx
Copy link
Contributor

Currently this is very very slow. For example, Myokit is approx ~30 times faster (only checked one file). We should do some profiling to see what causes this, and then check if there's some easy fixes to speed things up

@jonc125
Copy link
Contributor

jonc125 commented Jun 15, 2020

@MauriceHendrix would you be up for doing some profiling on e.g. regenerating the Chaste codegen references, so we have some data to guide optimisation?

I have a couple of ideas for improving the performance of get_equations_for specifically, but don't want to invest time in them unless it's going to be noticed!

  1. Cache the result of sorted_variables = nx.lexicographical_topological_sort(graph, key=str) for each graph, since that doesn't change unless the graph does. Does this method take a long time?
  2. Change the logic in get_equations_for so it doesn't do a sort, instead working backwards along graph edges from the nodes supplied to build a list of just the equations actually needed. It's possible this would change the ordering in generated code, but I think it should be a fair bit faster as it builds fewer interim data structures.

@MauriceHendrix
Copy link
Contributor

sure would be interested. Also in codegen in a few places I add stuff to the model and than just use get_equatons_for again. It would probably be faster if I just updated the list I already had, but that would be quite a bit of extra work.

@MauriceHendrix
Copy link
Contributor

I just ran a cprofile on load_model. By far the biggest time sinks are random, shuffle and xreplace. I wonder where random is being used from. For xreplace we know it's much faster than subs, but even so is there a way to reduce calls to xreplace?

ncalls tottime percall cumtime percall filename:lineno(function)
658184/36148 1.197 0.000 5.693 0.000 basic.py:1171(_xreplace)
2200631 1.474 0.000 2.120 0.000 random.py:224(_randbelow)
138250 1.017 0.000 3.153 0.000 random.py:264(shuffle)

@jonc125
Copy link
Contributor

jonc125 commented Jun 15, 2020

Probably useful to see the call graph, since there are various places we call into Sympy!

@MauriceHendrix
Copy link
Contributor

load_model

@MauriceHendrix
Copy link
Contributor

@jonc125 I tried both suggestions for cellmlmanip and the results are pretty disappointing:
I've tested it by generating code for Shannon2004 10* using timeit.timeit. Times below are everage per time

  • Just the load time is about: 6.8s
  1. Time for as is generating (from pre-loaded model object): 7.5s
  2. Time for caching the sorted_variables: 7.3s
  3. Time for not sorting at all: 7.3s - (this also generates uncompilable c++ code)
  4. If I hardcode recurse=False: 5.2s

@MauriceHendrix
Copy link
Contributor

MauriceHendrix commented Jun 23, 2020

Right @jonc125 and @MichaelClerx
I now have 3 separate pull requests:

  • Optional unsorted results #302 adds the an option to get unsorted results from the get_state_variables, get_derivatives, get_derived_quantities and get_variables_by_rdf methods of Model. The reason that that's useful for me is that with these calls I either resort them myself differently or I put them into a set, so the sort has become unnecessary for codegen. I have of course updated the tests to verify the results.
  • Parser speed optimisation #303 changes the parser to delay adding maths, untill after we know about the connections, also extended the tests to hit the parts of connect_variables that weren't being hit.
  • cache sorting in get_equations_for #304 caching the sorted graph for the get_equations_for method

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants