database of symbols #123

Open
Decencies opened this issue Jul 15, 2023 · 6 comments
Labels: backend (affects the enigma backend), enhancement (New feature or request)

Comments

@Decencies
Contributor

Decencies commented Jul 15, 2023

The enigma core should construct a database of the program's symbols and their references.
This will (in theory) enable the projects to load faster once the database has been built.

The database should store symbols for:

  • Classes
  • Fields
  • Methods
  • Parameters / Locals

When a program is first analyzed, the database is built.
On the next launch, the program loads the symbol tree from the database into memory (lazily?).

This avoids the unnecessary computational cost of re-analyzing the program.
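
A minimal sketch of what lazy loading could look like, assuming symbols are stored per class and fetched on first access. `LazySymbolDatabase`, `Symbol`, `Kind` and the `loader` function are hypothetical names for illustration, not part of enigma:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical lazily loaded symbol database; none of these names come from enigma. */
final class LazySymbolDatabase {
    /** The symbol kinds listed in the issue. */
    enum Kind { CLASS, FIELD, METHOD, PARAMETER, LOCAL }

    /** One stored symbol: its kind, its name, and its descriptor (empty for classes). */
    record Symbol(Kind kind, String name, String descriptor) {}

    // Assumption: the loader reads the symbols of a single class from the on-disk database.
    private final Function<String, List<Symbol>> loader;
    private final Map<String, List<Symbol>> loaded = new HashMap<>();

    LazySymbolDatabase(Function<String, List<Symbol>> loader) {
        this.loader = loader;
    }

    /** Returns the symbols of a class, invoking the loader only on first access. */
    List<Symbol> symbolsOf(String className) {
        return loaded.computeIfAbsent(className, loader);
    }

    /** Number of classes whose symbols are currently held in memory. */
    int loadedClassCount() {
        return loaded.size();
    }

    public static void main(String[] args) {
        LazySymbolDatabase db = new LazySymbolDatabase(name -> {
            System.out.println("loading " + name);
            return List.of(new Symbol(Kind.CLASS, name, ""));
        });
        db.symbolsOf("java/lang/Object"); // triggers the loader
        db.symbolsOf("java/lang/Object"); // served from memory
        System.out.println(db.loadedClassCount());
    }
}
```

Whether laziness pays off would depend on how much of the program a typical session touches; loading everything eagerly may be simpler if the database is small.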

Some considerations for the file format:
We should use a constant-pool-like approach, for example
(a tag in angle brackets, such as <cls>, denotes a 1-byte opcode):

<cls> java/lang/Object # entry 0
<name> equals # entry 1
<desc> (0)Z # entry 2
<method> 0 1 2 # entry 3

This saves storage space when building the database, and it is also primarily how the JVM structures its class-file constant pool.
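
A rough sketch of the interning idea behind such a pool, assuming each distinct entry is stored once and later entries refer to it by index. `SymbolPool`, `Tag` and `Entry` are hypothetical names, and descriptors are kept as plain text here instead of embedding pool indices, which is a simplification of the format above:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical interning pool: each distinct entry is stored once and referenced by index. */
final class SymbolPool {
    /** Hypothetical entry kinds, mirroring the cls, name, desc and method tags. */
    enum Tag { CLS, NAME, DESC, METHOD }

    /** One pool entry: a tag plus its payload (text, or space-separated indices). */
    record Entry(Tag tag, String value) {}

    private final List<Entry> entries = new ArrayList<>();
    private final Map<Entry, Integer> indices = new HashMap<>();

    /** Returns the index of the entry, adding it only if it is not already present. */
    int intern(Tag tag, String value) {
        Entry entry = new Entry(tag, value);
        Integer existing = indices.get(entry);
        if (existing != null) {
            return existing;
        }
        entries.add(entry);
        int index = entries.size() - 1;
        indices.put(entry, index);
        return index;
    }

    /** Interns a method as three indices pointing at its owner, name and descriptor. */
    int internMethod(String owner, String name, String descriptor) {
        int cls = intern(Tag.CLS, owner);
        int nm = intern(Tag.NAME, name);
        int desc = intern(Tag.DESC, descriptor);
        return intern(Tag.METHOD, cls + " " + nm + " " + desc);
    }

    Entry get(int index) {
        return entries.get(index);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SymbolPool pool = new SymbolPool();
        pool.internMethod("java/lang/Object", "hashCode", "()I");
        // The owner class is already in the pool, so only name, descriptor and method are added.
        pool.internMethod("java/lang/Object", "toString", "()Ljava/lang/String;");
        for (int i = 0; i < pool.size(); i++) {
            System.out.println(pool.get(i) + " # entry " + i);
        }
    }
}
```

The saving comes from repeated strings such as class names, which appear once in the pool no matter how many members refer to them.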

@ix0rai ix0rai added the enhancement and backend labels Jul 15, 2023
@ix0rai
Member

ix0rai commented Jul 16, 2023

on the implementation side I have a couple things:

  • we can associate jars with cache files based on hash
  • cache files should use the same directory as enigma's config
  • I think json is best
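
The first two points could be sketched roughly as follows, assuming SHA-256 as the hash and a cache file named after the digest. `SymbolCacheLocator`, its methods and the file naming scheme are hypothetical, and the config directory is passed in rather than looked up from enigma:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical helper that maps a jar to its cache file by content hash. */
final class SymbolCacheLocator {
    /** Streams the file through SHA-256 and returns the digest as lowercase hex. */
    static String sha256Hex(Path file) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return HexFormat.of().formatHex(digest.digest());
    }

    /** Assumed layout: the cache file sits directly in the config directory as "<hash>.json". */
    static Path cacheFileFor(Path jar, Path configDir) {
        return configDir.resolve(sha256Hex(jar) + ".json");
    }

    public static void main(String[] args) {
        Path jar = Path.of(args[0]);
        Path configDir = Path.of(args[1]);
        Path cache = cacheFileFor(jar, configDir);
        System.out.println(Files.exists(cache) ? "cache hit: " + cache : "cache miss: " + cache);
    }
}
```

Keying on the content hash means a renamed jar still finds its cache, while a modified jar gets a fresh one.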

@ix0rai ix0rai added this to the 2.0.0 milestone Jul 16, 2023
@Decencies
Contributor Author

Decencies commented Jul 16, 2023

> on the implementation side I have a couple things:
>
>   • we can associate jars with cache files based on hash
>   • cache files should use the same directory as enigma's config
>   • I think json is best

My idea was to rebuild the program entirely as a tree, which includes all class, field and method references. This will allow the database to be loaded even if the original JAR file is lost.
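
As a rough illustration of storing references alongside the symbols, here is a sketch of a reverse index from a member to the members that refer to it. `ReferenceIndex` and `Member` are hypothetical names, and a real tree would also need to model classes, fields and locals:

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical reverse index: for each member, the members that reference it. */
final class ReferenceIndex {
    /** A field or method identified by owner class, name and descriptor. */
    record Member(String owner, String name, String descriptor) {}

    private final Map<Member, Set<Member>> referencedBy = new HashMap<>();

    /** Records that the member "from" refers to the member "to". */
    void addReference(Member from, Member to) {
        referencedBy.computeIfAbsent(to, key -> new LinkedHashSet<>()).add(from);
    }

    /** Returns every member recorded as referring to the target, in insertion order. */
    Set<Member> referencesTo(Member target) {
        return referencedBy.getOrDefault(target, Set.of());
    }

    public static void main(String[] args) {
        ReferenceIndex index = new ReferenceIndex();
        Member callee = new Member("java/lang/Object", "hashCode", "()I");
        Member caller = new Member("com/example/Main", "run", "()V");
        index.addReference(caller, callee);
        System.out.println(index.referencesTo(callee));
    }
}
```

Because the index holds only names and descriptors, it can be answered from the database alone, without the original JAR.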

Another bonus: with an efficient storage format, we could cut 10-20 MB input JARs down to just a few hundred kilobytes, which would be a huge improvement.

@ix0rai
Member

ix0rai commented Jul 16, 2023

that's an interesting idea!
I think we should implement this soon, since we'd be able to use the same format for both the cache and the eventual packet sending the full jar to the client in enigma-server

@uniformization
Contributor

> … and the eventual packet sending the full jar to the client in enigma-server

There might be some potential legality issues regarding that

@ix0rai
Member

ix0rai commented Sep 3, 2023

in theory, but I can't see anyone actually getting mad at us over it

@Decencies
Contributor Author

> > … and the eventual packet sending the full jar to the client in enigma-server
>
> There might be some potential legality issues regarding that

It is up to the user to determine who can access their service, and subsequently who can access the JAR file served.

@ix0rai ix0rai removed this from the 2.0.0 milestone Oct 5, 2023