hermes_rs

Note: Still a WIP. PRs are always welcome.

A dependency-free disassembler and assembler for the Hermes bytecode, written in Rust.

A special thanks to P1sec for digging through the Hermes git repo, pulling all of the BytecodeDef files out, and tagging them. This made writing this tool much easier.

hermes_rs
Hermes Resources
Development
- Supporting new versions of Hermes
- TODO

Supported HBC Versions

HBC Version	Disassembler	(Binary) Assembler	(Textual) Assembler	Decompiler
89	✅	✅	❌	❌
90	✅	✅	❌	❌
93	✅	✅	❌	❌
94	✅	✅	❌	❌
95	✅	✅	❌	❌
96	✅	✅	❌	❌

Project Goals

Full coverage for all public HBC versions
The ability to inject code stubs directly into the .hbc file for instrumentation
Textual HBC assembly

Potential Use cases

Find which functions reference specific strings
Generate frida hooks for mobile implementations
- hermes loader -> hook loading the package -> feed to hermes_rs -> patch code for bidirectional communication or even just logging
Writing fuzzers

Features

bLaZiNgLy FaSt
Export strings
Export bytecode
Encoding instructions piecemeal
Reduce binary size by only enabling certain versions of HBC

Installation

cargo add hermes_rs

Usage

Reading File Header

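use std::fs::File;
use std::io;
// HermesHeader and HermesStruct come from hermes_rs; import them from the crate as well.

let f = File::open("./test_file.hbc").expect("no file found");
let mut reader = io::BufReader::new(f);
let header: HermesHeader = HermesStruct::deserialize(&mut reader);

println!("Header: {:?}", header);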

Output:

{
  magic: 2240826417119764422,
  version: 94,
  sha1: [ 13, 37, 133, 71, 337, 17, 182, 139, 155, 223, 133, 7, 132, 109, 21, 96, 3, 12, 19, 56],
  file_length: 1102,
  global_code_index: 0,
  function_count: 3,
  string_kind_count: 2,
  identifier_count: 5,
  string_count: 14,
  overflow_string_count: 0,
  string_storage_size: 88,
  big_int_count: 0,
  big_int_storage_size: 0,
  reg_exp_count: 0,
  reg_exp_storage_size: 0,
  array_buffer_size: 0,
  obj_key_buffer_size: 0,
  obj_value_buffer_size: 0,
  segment_id: 0,
  cjs_module_count: 0,
  function_source_count: 0,
  debug_info_offset: 628,
  options: BytecodeOptions {
    static_builtins: false,
    cjs_modules_statically_resolved: false,
    has_async: false,
    flags: false
  },
  function_headers: [
    SmallFunctionHeader {
      offset: 348,
      param_count: 1,
      byte_size: 69,
      func_name: 5,
      info_offset: 576,
      frame_size: 15,
      env_size: 2,
      highest_read_cache_index: 2,
      highest_write_cache_index: 0,
      flags: FunctionHeaderFlag {
        prohibit_invoke: ProhibitNone,
        strict_mode: false,
        has_exception_handler: true,
        has_debug_info: true,
        overflowed: false
      }
    },
    // ...
  ],
  string_kinds: [
    StringKindEntry { count: 9, kind: String },
    StringKindEntry { count: 5, kind: Identifier }
  ],
  string_storage: [
    SmallStringTableEntry { is_utf_16: false, offset: 0, length: 4},
    // ...
  ],
  string_storage_bytes: [ 119, 101, 101, ... ],
  overflow_string_storage: []
}
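The string_storage entries and string_storage_bytes in the output above fit together roughly as follows. This is a minimal standalone sketch, not the crate's API: StringEntry and read_string are hypothetical names, and it assumes that length counts characters (so UTF-16 entries span two bytes per character, little-endian) and that non-UTF-16 entries are single-byte ASCII. The storage bytes are made up for the example.

```rust
/// Hypothetical mirror of the `SmallStringTableEntry` fields printed above.
#[derive(Debug, Clone, Copy)]
struct StringEntry {
    is_utf_16: bool,
    offset: usize,
    /// Assumed to count characters, so UTF-16 entries span 2 * length bytes.
    length: usize,
}

/// Slice one string out of the shared storage buffer.
/// Returns None if the entry points outside the buffer.
fn read_string(storage: &[u8], entry: StringEntry) -> Option<String> {
    let byte_len = if entry.is_utf_16 {
        entry.length.checked_mul(2)?
    } else {
        entry.length
    };
    let end = entry.offset.checked_add(byte_len)?;
    let raw = storage.get(entry.offset..end)?;

    if entry.is_utf_16 {
        // Assumption: UTF-16 code units are stored little-endian.
        let units: Vec<u16> = raw
            .chunks_exact(2)
            .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
            .collect();
        Some(String::from_utf16_lossy(&units))
    } else {
        // Assumption: one byte per character.
        Some(raw.iter().map(|&b| b as char).collect())
    }
}

fn main() {
    // Made-up storage: "foo" as single bytes, then "hi" as UTF-16LE.
    let storage = [b'f', b'o', b'o', b'h', 0, b'i', 0];
    let ascii = StringEntry { is_utf_16: false, offset: 0, length: 3 };
    let wide = StringEntry { is_utf_16: true, offset: 3, length: 2 };

    println!("{:?}", read_string(&storage, ascii));
    println!("{:?}", read_string(&storage, wide));
}
```

Returning Option instead of panicking keeps the sketch usable on truncated or malformed files, which matters if you are feeding it fuzzed input.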

Reading Function Headers

header.function_headers.iter().for_each(|fh| {
  println!("function header: {:?}", fh);
});

// Prints the following:
// function header: SmallFunctionHeader { ... }
// ...
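Each function header carries an offset and a byte_size, which is enough to locate the raw bytecode of a function yourself. Here is a small standalone sketch of that lookup. FnHeader and function_body are hypothetical names rather than part of hermes_rs, and the sketch assumes that offset is measured from the start of the file.

```rust
/// Hypothetical subset of the `SmallFunctionHeader` fields shown earlier.
struct FnHeader {
    offset: u32,
    byte_size: u32,
}

/// Return the raw bytecode for one function, or None if the header
/// points outside the file.
/// Assumption: `offset` is relative to the start of the file.
fn function_body<'a>(file: &'a [u8], header: &FnHeader) -> Option<&'a [u8]> {
    let start = header.offset as usize;
    let end = start.checked_add(header.byte_size as usize)?;
    file.get(start..end)
}

fn main() {
    // Made-up 16-byte "file" with a 3-byte function body at offset 4.
    let mut file = vec![0u8; 16];
    file[4..7].copy_from_slice(&[0xAA, 0xBB, 0xCC]);

    let header = FnHeader { offset: 4, byte_size: 3 };
    println!("{:?}", function_body(&file, &header));
}
```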

Parsing Bytecode

header.parse_bytecode(&mut reader);

// By default, this prints the disassembly shown below. End users are expected
// to bring their own logic for working with the instructions as needed.

/*
Function<foo>(1 params, 1 registers, 0 symbols):
  LoadConstString  r0,  "bar"
  AsyncBreakCheck
  Ret  r0
*/

Encoding Instructions

Encoding instructions is straightforward: each Instruction implements a trait with deserialize and serialize methods.

You can alias the imports for your version to shorten the code, but fully expanded it looks something like this:

let load_const_string = hermes::v94::Instruction::LoadConstString(hermes::v94::LoadConstString {
  op: hermes::v94::str_to_op("LoadConstString"),
  r0: 0,
  p0: 2,
});

let async_break_check = hermes::v94::Instruction::AsyncBreakCheck(hermes::v94::AsyncBreakCheck {
  op: hermes::v94::str_to_op("AsyncBreakCheck"),
});

let ret = hermes::v94::Instruction::Ret(hermes::v94::Ret {
  op: hermes::v94::str_to_op("Ret"),
  r0: 0,
});

let instructions = vec![load_const_string, async_break_check, ret];

let mut writer = Vec::new();
for instr in instructions {
    instr.serialize(&mut writer);
}

// Make sure the encoded bytes are valid
assert_eq!(writer, vec![115, 0, 2, 0, 98, 92, 0], "Bytecode is incorrect!");
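To make the asserted byte layout concrete, here is a hand-rolled decoder for just those seven bytes. It is a standalone sketch and does not use the crate: the Decoded enum and decode function are hypothetical, the opcode numbers (115, 98, 92) are read off the v94 assertion above, and the operand widths (one byte for a register, a little-endian u16 for the string index) are inferred from the same bytes.

```rust
/// The three instructions from the example, decoded by hand.
#[derive(Debug, PartialEq)]
enum Decoded {
    LoadConstString { r0: u8, p0: u16 },
    AsyncBreakCheck,
    Ret { r0: u8 },
}

/// Decode a byte stream that only contains the three opcodes above.
/// Opcode values are taken from the v94 assertion in this README.
fn decode(bytes: &[u8]) -> Result<Vec<Decoded>, String> {
    let mut out = Vec::new();
    let mut i = 0;

    while i < bytes.len() {
        let op = bytes[i];
        i += 1;

        let ins = match op {
            115 => {
                // One register byte followed by a little-endian u16 string index.
                let operands = bytes.get(i..i + 3).ok_or("truncated LoadConstString")?;
                i += 3;
                Decoded::LoadConstString {
                    r0: operands[0],
                    p0: u16::from_le_bytes([operands[1], operands[2]]),
                }
            }
            98 => Decoded::AsyncBreakCheck,
            92 => {
                let r0 = *bytes.get(i).ok_or("truncated Ret")?;
                i += 1;
                Decoded::Ret { r0 }
            }
            other => return Err(format!("unknown opcode {other}")),
        };
        out.push(ins);
    }

    Ok(out)
}

fn main() {
    let bytes = [115, 0, 2, 0, 98, 92, 0];
    for ins in decode(&bytes).expect("valid bytecode") {
        println!("{:?}", ins);
    }
}
```

A real disassembler needs the full opcode table for each HBC version, which is what the generated per-version modules provide.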

Using specific HBC Versions

Want to use a specific version of the Hermes bytecode and reduce your binary size?

In Cargo.toml, find the hermes_rs dependency and select which HBC version(s) you'd like to use in your application.

Example:

[dependencies]
hermes_rs = { features = ["v89", "v90", "v93", "v94", "v95"] }

Hermes Resources

I leaned heavily on the following projects and resources to develop this package.

Official docs: https://hermesengine.dev/
- Source: https://github.com/facebook/hermes
hermes-dec disassembler/decompiler:
- https://github.com/P1sec/hermes-dec
- Opcode Docs: https://p1sec.github.io/hermes-dec/opcodes_table.html
hbctool: https://github.com/bongtrop/hbctool
hasmer (stale): https://github.com/lucasbaizer2/hasmer

Development

Supporting new versions of Hermes

The script ./def_versions/_gen_macros.js takes a Bytecode Definition file as its first argument, parses it, and prints a file containing the macro body needed to support that version's instructions.

# How I generated them

cd ./def_versions

node _gen_macros.js 89.def > ../src/hermes/v89/mod.rs
node _gen_macros.js 90.def > ../src/hermes/v90/mod.rs
node _gen_macros.js 93.def > ../src/hermes/v93/mod.rs
node _gen_macros.js 94.def > ../src/hermes/v94/mod.rs
node _gen_macros.js 95.def > ../src/hermes/v95/mod.rs

Example with a hypothetical v100 version:

node _gen_macros.js v100.def

Which outputs:

use crate::hermes;

build_instructions!(
  (0, Unreachable, ),
  (1, NewObjectWithBuffer, r0: Reg8, p0: UInt16, p1: UInt16, p2: UInt16, p3: UInt16),
  (2, NewObjectWithBufferLong, r0: Reg8, p0: UInt16, p1: UInt16, p2: UInt32, p3: UInt32),
  (3, NewObject, r0: Reg8),
  (4, NewObjectWithParent, r0: Reg8, r1: Reg8),
  ... other instructions here
);

From here, you'll add a new directory and mod.rs file for this version (./src/hermes/v100/mod.rs) and paste the output from the script into it.

This could (and probably should) be a build.rs process.
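For readers wondering what a macro fed with those tuples might produce, here is a simplified stand-in. It is not the crate's build_instructions! and the real macro may generate different items. It assumes Reg8, UInt16 and UInt32 map to u8, u16 and u32, and it only derives an enum plus opcode and size helpers.

```rust
// Assumed operand widths, based on the type names in the generated output.
type Reg8 = u8;
type UInt16 = u16;
type UInt32 = u32;

// Simplified stand-in for the crate's macro: one enum variant per tuple,
// plus helpers that report the opcode and the encoded size.
macro_rules! build_instructions {
    ($( ($opcode:literal, $name:ident, $($field:ident : $ty:ty),*) ),* $(,)?) => {
        #[allow(dead_code)]
        #[derive(Debug)]
        enum Instruction {
            $( $name { $($field: $ty),* } ),*
        }

        #[allow(dead_code)]
        impl Instruction {
            fn opcode(&self) -> u8 {
                match self {
                    $( Instruction::$name { .. } => $opcode ),*
                }
            }

            // One opcode byte plus the width of every operand.
            fn size(&self) -> usize {
                match self {
                    $( Instruction::$name { .. } => 1 $(+ std::mem::size_of::<$ty>())* ),*
                }
            }
        }
    };
}

build_instructions!(
    (0, Unreachable, ),
    (1, NewObjectWithBuffer, r0: Reg8, p0: UInt16, p1: UInt16, p2: UInt16, p3: UInt16),
    (2, NewObjectWithBufferLong, r0: Reg8, p0: UInt16, p1: UInt16, p2: UInt32, p3: UInt32),
    (3, NewObject, r0: Reg8),
);

fn main() {
    let ins = Instruction::NewObjectWithBuffer { r0: 0, p0: 1, p1: 2, p2: 3, p3: 4 };
    println!("{:?}: opcode {}, {} bytes", ins, ins.opcode(), ins.size());
}
```

Because the operand list is the only per-version input, supporting a new version in a design like this mostly comes down to feeding the macro a new table.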

After creating this file, open ./src/hermes/mod.rs, add the import next to the other Instruction module imports, then populate the Instruction enum, the trait, and the match statements in the other functions with the new version. You'll likely need to rely on the compiler to complain about missing match arms; there are only a few, though.

As this codebase evolves, you may need to add match arms in other places.

#[macro_use]
#[cfg(feature = "v100")]
pub mod v100;

// ...

pub enum Instruction {
  // ...
  #[cfg(feature = "v100")]
  V100(v100::Instruction),
}

// ...

impl Instruction {
  // implement the methods of the trait
  fn display(&self, _hermes: &HermesHeader) -> String {
    match self {
      // ...
      #[cfg(feature = "v100")]
      Instruction::V100(instruction) => instruction.display(_hermes),
    }
  }

  fn size(&self) -> usize {
    match self {
      // ...
      #[cfg(feature = "v100")]
      Instruction::V100(instruction) => instruction.size(),
    }
  }
}


// ...

// In parse_bytecode there's currently a match statement that will also need to be populated...
let ins_obj: Option<Instruction> = match self.version {
  #[cfg(feature = "v89")]
  89 => Some(Instruction::V89(v89::Instruction::deserialize(&mut r_cursor, op))),
  #[cfg(feature = "v90")]
  90 => Some(Instruction::V90(v90::Instruction::deserialize(&mut r_cursor, op))),
  #[cfg(feature = "v93")]
  93 => Some(Instruction::V93(v93::Instruction::deserialize(&mut r_cursor, op))),
  #[cfg(feature = "v94")]
  94 => Some(Instruction::V94(v94::Instruction::deserialize(&mut r_cursor, op))),
  #[cfg(feature = "v95")]
  95 => Some(Instruction::V95(v95::Instruction::deserialize(&mut r_cursor, op))),
  _ => None,
};

Finally, add the feature (v100 = []) to Cargo.toml.

TODO

DebugInfo definition stuff
Add comments
Add docs
Serializer implementations for everything

Struct	Deserialize	Serialize	Size
HermesHeader	✅	❌	❌
SmallFunctionHeader	✅	❌	❌
FunctionHeader	✅	✅	✅
StringKindEntry	✅	✅	✅
SmallStringTableEntry	✅	✅	✅
OverflowStringTableEntry	✅	✅	✅
BigIntTableEntry	✅	❌	❌
BytecodeOptions	✅	✅	✅
DebugInfoOffsets	✅	✅	✅
DebugInfoHeader	✅	✅	✅
DebugFileRegion	✅	✅	✅
ExceptionHandlerInfo	✅	✅	✅
RegExpTableEntry	✅	✅	✅
FunctionHeaderFlag	✅	❌	❌

Parse in the correct order:

// From official Hermes source code
void visitBytecodeSegmentsInOrder(Visitor &visitor) {
  visitor.visitFunctionHeaders();
  visitor.visitStringKinds();
  visitor.visitIdentifierHashes();
  visitor.visitSmallStringTable();
  visitor.visitOverflowStringTable();
  visitor.visitStringStorage();
  visitor.visitArrayBuffer();
  visitor.visitObjectKeyBuffer();
  visitor.visitObjectValueBuffer();
  visitor.visitBigIntTable();
  visitor.visitBigIntStorage();
  visitor.visitRegExpTable();
  visitor.visitRegExpStorage();
  visitor.visitCJSModuleTable();
  visitor.visitFunctionSourceTable();
}
