Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

no_std is broken, doesn't play with sgx #31

Open
brenzi opened this issue Apr 29, 2019 · 46 comments
Open

no_std is broken, doesn't play with sgx #31

brenzi opened this issue Apr 29, 2019 · 46 comments

Comments

@brenzi
Copy link

brenzi commented Apr 29, 2019

I'm trying to use this lib within sgx using

https://github.com/baidu/rust-sgx-sdk

It seems that #[no_std] is broken as std gets pulled in in places (like context.rs).
Also alloc is not imported

@elichai commented the following: apache/incubator-teaclave-sgx-sdk#93 (comment)

I'll need schnorrkel to support sgx for substraTEE

@brenzi
Copy link
Author

brenzi commented Apr 29, 2019

When I build this crate with

cargo build --release --no-default-features --features "u64_backend alloc"

errors explode.

the low hanging fruit:

diff --git a/src/context.rs b/src/context.rs
index 44d9ab3..ec56bc1 100644
--- a/src/context.rs
+++ b/src/context.rs
@@ -9,7 +9,7 @@
 
 //! ### Schnorr signature contexts and configuration, adaptable to most Schnorr signature schemes.
 
-use std::cell::RefCell;
+use core::cell::RefCell;
 
 use rand::prelude::*;  // {RngCore,thread_rng};

But that doesn't do the trick by far

@burdges
Copy link
Collaborator

burdges commented Apr 30, 2019

I've pushed a fix that passes your build command. It suffices? Thanks!

I'll publish a new version soon, but we've several compatibility breaks already in master and..

I'd love to resolve #9 along with the other compatibility breaks, but so far nothing conclusive arose. I'll make a decision there asap. :)

@burdges
Copy link
Collaborator

burdges commented Apr 30, 2019

Appears ed25519-dalek lacks this extern crate alloc; that I needed, so not sure if I fixed those errors correctly or not.

@brenzi
Copy link
Author

brenzi commented Apr 30, 2019

Cheers for your attempt to fix this issue.

I now get

error[E0658]: use of unstable library feature 'alloc': this library is unlikely to be stabilized in its current form or name (see issue #27783)
   --> /home/abrenzikofer/.cargo/git/checkouts/schnorrkel-0d699dd19a5d19bc/bc1cf8e/src/lib.rs:262:1
    |
262 | extern crate alloc;
    | ^^^^^^^^^^^^^^^^^^^
    |
    = help: add #![feature(alloc)] to the crate attributes to enable

error: aborting due to previous error

So I must guess this issue isn't fixed just yet - not only because of upstream crate ed25519-dalek

@brenzi
Copy link
Author

brenzi commented Apr 30, 2019

edit: sorry, this issue should be fixed by updating rust nightly

There are further issues when building without alloc feature:

> cargo build --release --no-default-features --features "u64_backend"
error[E0432]: unresolved import `crate::sign::verify_batch`
   --> src/lib.rs:308:50
    |
308 | pub use crate::sign::{Signature,SIGNATURE_LENGTH,verify_batch};
    |                                                  ^^^^^^^^^^^^ no `verify_batch` in `sign`

error[E0433]: failed to resolve: use of undeclared type or module `BTreeMap`
   --> src/musig.rs:373:22
    |
373 |         let mut Rs = BTreeMap::new();
    |                      ^^^^^^^^ use of undeclared type or module `BTreeMap`

...

@brenzi
Copy link
Author

brenzi commented Apr 30, 2019

Doesn't build with rustsgx-sdk. Compiler still complains:

error: duplicate lang item in crate `std`: `f32_runtime`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `f64_runtime`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `panic_impl`.
  |
  = note: first defined in crate `sgx_tstd`.

error: duplicate lang item in crate `std`: `oom`.
  |
  = note: first defined in crate `sgx_trts`.

It looks like one or more of your dependencies don't support no_std: merlin ?

@burdges
Copy link
Collaborator

burdges commented Apr 30, 2019

If I understand, the #![feature(alloc)] error comes form your rust being outdated, yes?

I've fixed the issues I understood and pushed the changes, in part by omitting musig without alloc or std. I even made cargo test .. work with various combinations, which eve dalek does not do.

I still do not understand why ed25519-dalek does not require extern crate alloc; like schnorrkel seemingly does. An edition setting in Cargo.toml maybe?

I also do not understand the no_std: merlin issue. I know merlin's tests use strobe-rs, which requires Vec, but outside tests I'm unsure about the problem. Can you pin down the actual problem?

@burdges burdges reopened this Apr 30, 2019
@burdges
Copy link
Collaborator

burdges commented Apr 30, 2019

There is a serious dependency flaw caused by the rand crate being chopped up into too many poinless subcrates: rust-random/rand#738 Is that the problem? I'm afraid merlin does create an Rng, but some workaround exist I think.

@burdges
Copy link
Collaborator

burdges commented May 1, 2019

Or maybe my Cargo.toml does not pass through some feature flags correctly?

@brenzi
Copy link
Author

brenzi commented May 1, 2019

right now we're trying to use ed25519 signatures with our node as a workaround. I'll revert when I find time for further debugging

@burdges
Copy link
Collaborator

burdges commented May 1, 2019

I've found the problem maybe.. merlin lacks any support for no_std, like you said.

I believe https://github.com/dalek-cryptography/merlin/blob/master/src/strobe.rs#L4 might be the only problematic line in merlin. Can you tell if using this branch of merlin works? It merely uses use core::ops::{Deref, DerefMut}; and passes through the std feature.

@burdges
Copy link
Collaborator

burdges commented May 1, 2019

It appears my second commit there to Cargo.toml breaks their CI, but the first seemingly passed fine. It's maybe not required to pass through the std dependency unless you really use functionality only present with std I guess?

@brenzi
Copy link
Author

brenzi commented May 1, 2019

I just tried to use your merlin branch within sgx. no success. Same duplicate lang item error as above

toml:

merlin = { git ="https://github.com/burdges/merlin.git", branch="no_std", default-features = false}

code:

extern crate merlin;
use merlin::{Transcript};

@burdges
Copy link
Collaborator

burdges commented May 1, 2019

It's likely the clear_on_drop crate dalek-cryptography/curve25519-dalek#236 (comment) cesarb/clear_on_drop#20 but that'll likely impact ed25519-dalek too.

We should switch from clear_on_drop to zeroize whenever curve25519-dalek merges dalek-cryptography/curve25519-dalek#236

We probably could not publish a schnorrkel version in which no std required a curve25519-dalek branch, but your build can work fine from branches I suppose? It's also possible to temporarily disable zeroing when building for WASM by defining dummy ClearOnDrop trait when required.

We could investigate the rand micro-crate bug further I mentioned above #31 (comment) this does not fix it.

@burdges
Copy link
Collaborator

burdges commented May 12, 2019

I pushed a temporary fix in d5a4835 but not sure if it helps you really. We'll switch to the Zeroize crate whenever it works, so I'll leave this open.

@brenzi
Copy link
Author

brenzi commented May 14, 2019

Doesn't work for me: error: duplicate lang item in crate std
We're working with an ed25519 chain now, so no hurry. We're fine for now.

@burdges
Copy link
Collaborator

burdges commented May 14, 2019

I cannot reproduce that error myself, but it sounds like something is importing std and core separately or something.

@isislovecruft
Copy link
Contributor

If you're okay with using nightly compilers only, I have a merlin branch that might work for you: dalek-cryptography/merlin#42

@burdges
Copy link
Collaborator

burdges commented Jun 4, 2019

Appears we lacked many default_features = false lines here too. Any idea if d5a4835 helps @brenzi ? I want to revert it if possible.

@brenzi
Copy link
Author

brenzi commented Jun 4, 2019

@burdges thanks for the update. I won't be able to test it until next week.

@burdges
Copy link
Collaborator

burdges commented Jul 26, 2019

We moved to the zeroize crate's trait with a manual impl, due to proc macros not working on both nightly and stable. I'm afraid clear_on_drop remains a curve25519-dalek dependency, but maybe this reduces conflicts if another crate imported clear_on_drop differently.

@burdges
Copy link
Collaborator

burdges commented Aug 5, 2019

Right now cargo build --release --no-default-features --features "u64_backend alloc" works, but..

Afaik, there is no way ::rand::thread_rng() should compile without std due to https://docs.rs/rand/0.6.5/src/rand/rngs/mod.rs.html#156

Any ideas why this works for me @dhardy ? I should actually supply some build environment without an OsRng perhaps?

@burdges
Copy link
Collaborator

burdges commented Aug 6, 2019

I reduced the dependence on rand to just rand_core in 21a039e but with a rand feature, which seemingly reduces dependence on std dramatically. It'll use rand_os without the rand feature, which gets slower, but works.

I still do not understood how thread_rng exists without std, but currently the default randomness source without std is PanicRng which simply panics: 21a039e#diff-b4aea3e418ccdb71239b96952d9cddb6R244

It's possible the rand_os dependency should be optional too, but we should understand how rand really works in no_std first.

@chesterliliang
Copy link

Hi Burdges.

When we try to compile schnorrkel to target thumbv7m-none-eabi which is for ARM cortex M3, we found there are some issues generated by 'rand'. Most of them are caused by usage of thread_rng(), but some of them are hard to tell because errors changing on different version of rand.

We have a suggestion that considering schnorrkel may porting to different platform including MCU for IOT devices or hardware wallet, a branch which let users choose the random source by themselves and transfer it into our code. Here are benefits if we make this branch:

  1. Make the master branch need not to take care this kind of issues. The difference could be separated by features if branch merged like
  #[cfg(feature = "ext_rng")]
    pub fn sign_trng<T: SigningTranscript>(&self, mut t: T, trng: &[u8], public_key: &PublicKey) -> Signature 
    {
        t.proto_name(b"Schnorr-sig");
        t.commit_point(b"pk\x00",public_key.as_compressed());

        let r = t.witness_scalar_trng(b"signing\x00",&[&self.nonce],trng);  // context, message, A/public_key
        let R = (&r * &constants::RISTRETTO_BASEPOINT_TABLE).compress();

         t.commit_point(b"no\x00",&R);
        let k: Scalar = t.challenge_scalar(b"sign\x00");  // context, message, A/public_key, R=rG
        let s: Scalar = &(&k * &self.key) + &r;

        Signature{ R, s }
    }
  1. Generate the pseudo-random number will cost much for little MCU system, not only the time, but also need to consider if the entropy source is good enough or even exsit. Since many chips don't run any OS at all in IOT world and they don't have good RTC schematic or even dot not have anything about time which is main entropy source of pseudo-random number. So it is not just the no
    -std problems in RUST. However, the hardware system may have other good entropy source such as noise. And they could use it by accessing registers in serval lines of code:
void GetTRNG(uint16_t len, uint8_t * dataOut)
{
...

 RCC_SEC1Periph_ClockCmd(RCC_SEC1Periph_TRNG,ENABLE);
 
 while(wordLen--)
 {
  RNGCLR = 0x01;

  EVINT_MCUEV_Cmd(TRNG_IRQn,ENABLE);
  RNGCON = 0x01;  //start TRNG
  while(!(RNGSTS & 0x01))
  {     
   __WFE();
  }
  EVINT_MCUEV_Cmd(TRNG_IRQn,DISABLE);
  
  *(uint32_t *)(dataOut) = RNGDAT;
  dataOut += 4;
 }
 
 ...
 RNGSTOP = 0x01; //stop RNG
 
 RCC_SEC1Periph_ClockCmd(RCC_SEC1Periph_TRNG,DISABLE);
}
  1. So it is convenient for the embedded developer to choose their way to generate RNG if we have this branch. And moreover, if we fix the RNG generate methond in schnorrkel but our depends works not well when porting to other platform, it make rise the security risks for the users which may beyond our contrl.

We did some test in our fork wookong-dot-schnorrkel. An embedded folder added with cortex-rust project which could run QEMU, ARM simulator. You could try it with the steps in README.md

@burdges
Copy link
Collaborator

burdges commented Aug 6, 2019

I fixed 1 in 21a039e I think. Infact, yYou could always replace thread_rng by calling attach_rng, assuming your system has good randomness, but I removed the dependency in that commit.

In principle, you could create a wrapper type pub struct TranscriptWithMyRng(::merlin::Transcript);, delegate required methods of SigningTranscript, and make witness_bytes call your good system CSPRNG. attach_rng is easier if you only do this occasionally though.

I think 2 sounds out of scope, not sure I understand 3 yet. There is considerable work on tools like jitter_rng, etc. from the rand crate developers, but mostly they gave up on creating their own randomness. I think the getrandom crate provide an OS interface used by rand_os.

@chesterliliang
Copy link

21a039e works with cargo build --release --no-default-features --features "u64_backend alloc"
but fails with
cargo build --release --target thumbv7m-none-eabi --no-default-features
or
cargo build --release --target thumbv7m-none-eabi --no-default-feature --features "u64_backend alloc"

ChestersdeMacBook-Pro:schnorrkel chester$ cargo build --release --target thumbv7m-none-eabi --no-default-features
   Compiling typenum v1.10.0
   Compiling cc v1.0.38
   Compiling byteorder v1.3.2
   Compiling rand_core v0.4.0
   Compiling byte-tools v0.3.1
   Compiling subtle v2.1.1
   Compiling fake-simd v0.1.2
   Compiling opaque-debug v0.2.3
error[E0463]: can't find crate for `std`
  |
  = note: the `thumbv7m-none-eabi` target may not be installed

error: aborting due to previous error

For more information about this error, try `rustc --explain E0463`.
error: Could not compile `rand_core`.
warning: build failed, waiting for other jobs to finish...
error: build failed

@dhardy
Copy link

dhardy commented Aug 6, 2019

Sorry @burdges I don't know the answer to that question.

@burdges
Copy link
Collaborator

burdges commented Aug 6, 2019

Thanks anyways @dhardy :)

It's opaque-debug that fails due to some no_std between it and that platform? It's seemingly used by sha2 and sha3. Can you build sha2 all by itself? If not ask @newpavlov

I sadly cannot purge sha2 from this crate, due to kusama using the ExpansionMode::Ed25519 option. I could however feature gate sha2 so that you can avoid using it if you never transfer secrets with kusama's subkey, etc. And we'll might dump ExpansionMode::Ed25519 in polkadot. Also, you could fork schnorrkel using some sha2 crate without the complex dependencies.

Just fyi, we cannot purge the digest crate because curve25519-dalek depends upon it, but we do not depend upon digest for anything important, so again digest could be avoided in some fork of both curve25519-dalek and schnorrkel. We'd then have dependencies down to exclusive dalek ecosystem and lower level rand crates.

@newpavlov
Copy link

newpavlov commented Aug 6, 2019

Attempting to remove sha2 just because you think it breaks no_std build looks like an overkill to me. If problem indeed originates in RustCrypto crates, please, report it and I will try to fix it ASAP.

Anyway, I've just re-run no_std CI test for RustCrypto/hashes and it has passed without any issues. I highly doubt problem is in opaque_debug, as you can see here, code is dead simple and does not use std anywhere.

Also, please read the compilation error message more carefully:

error: Could not compile rand_core.

I guess, std feature gets enabled for rand_core, so probably one of the dependencies has forgot to use default-features = false and correctly cascade std feature.

@burdges
Copy link
Collaborator

burdges commented Aug 7, 2019

Right, thank you @newpavlov ! :) I jumped on sha2, etc. because they should serve no purpose here, but I made some stupid mistakes early that resulted in their inclusion.

I tweaked the rand handling in 2d4c2ef @chesterliliang by copying the feature propagation from https://docs.rs/crate/rand/0.6.5/source/Cargo.toml but..

I do not understand the relationship between rand_os/getrandom and std, and my current scheme looks incorrect, so no promises.

Can you take a look at that commit next week? @jacogr Are the wasm, etc. features doing the right thing?

It appears merlin and curve25519-dalek require different rand_core versions, likely stubbed together, but even the stubs need their default-features = false lines.

@jacogr
Copy link
Contributor

jacogr commented Aug 7, 2019

@burdges The WASM compilation (via the Rust wasmpack), creates externs for the random functions and then you need to provide these via the Node.js crypto module, which is effectively only getRandomBytes and randomFillsync functions.

(So it has not been an issue at all in the past with the 0.1.x up to 0.8.4 - well, different environments such as React Native do have some issues, but that is solved via polyfills for the extern crypto module, doesn't work out of the box, but with some project tweaks and filling in the gaps, it does)

@burdges
Copy link
Collaborator

burdges commented Aug 7, 2019

Any idea how rand::thread_rng exists when substrate compiles schnorrkel no_std? Or maybe we're not no_std? If not, then I'm not worried here.

@chesterliliang
Copy link

Rand use prelude to re-export and I see some discussion about no_std and crate:prelude in std-using paths work just fine in 2018 edition #![no_std] and Add extern crate items to extern prelude. Not sure if it is related to our problem.

@burdges
Copy link
Collaborator

burdges commented Aug 7, 2019

We no longer depend upon rand, only rand_core, unless you pass the rand feature.

@burdges
Copy link
Collaborator

burdges commented Aug 7, 2019

I read https://github.com/rust-random/rand/blob/master/rand_os/Cargo.toml#L25 as activating std for rand_core whenever rand_os gets used, likely causing #31 (comment)

@burdges
Copy link
Collaborator

burdges commented Aug 11, 2019

I think rand_os now avoids activating std in rand_core as of rust-random/rand#859 so that's fixed.

We cannot expect merlin to update to rand_core 0.6 anytime soon because adding RngCore::from_entropy proved to be the breaking change that broke the camels back, ala dalek-cryptography/curve25519-dalek#262 dalek-cryptography/curve25519-dalek#263 etc.

As I said in dalek-cryptography/merlin#47 (comment) we can either

Any thoughts on substrate's upgrade path for rand 0.7 @jacogr ? No rush right?

@burdges
Copy link
Collaborator

burdges commented Aug 11, 2019

I stopped the *_backend features from activating ed25519 in 33083a9 which likely contributed here, including to both @brenzi original problem and @chesterliliang more recent issue.

I explained how a wrapper shim could support merlin in rust-random/rand#865 (comment) but it seemingly either needs an extra crate or else nightly for rust-lang/cargo#4953

@burdges
Copy link
Collaborator

burdges commented Aug 13, 2019

I've a "schizomerlin" branch https://github.com/w3f/schnorrkel/tree/schizomerlin that avoids invoking the std feature of rand_core, so test that one too.

I doubt cargo feature work on stable, so if we need schizomerlin then we might factor RngCore5As4 out into a separate crate:
8150ef6

@brenzi
Copy link
Author

brenzi commented Sep 21, 2019

Re-visiting this after a while of absence.

Neither schizomerlin branch nor master work for me.

@brenzi
Copy link
Author

brenzi commented Sep 21, 2019

as schnorrkel depends on ed25519-dalek, the following might be an issue:
dalek-cryptography/ed25519-dalek#95

but strange enough, my minimal example actually works fine with no_std:
https://github.com/scs/test-no-std/tree/schnorrkel
If you build this with

cargo build --no-default-features

it actually works. But if I use schnorrkel within my real crate, it doesn't:

Reproduce:

  1. get https://github.com/scs/substraTEE-worker/tree/schnorrkel-no-std-debug
  2. cd enclave
  3. cargo build --no-default-features
    should fail.

but if you remove this use statement: https://github.com/scs/substraTEE-worker/blob/57b34bfda28dcac07749e86944d1ee17b639fbc8/primitives/src/sr25519.rs#L26

it will build

@burdges
Copy link
Collaborator

burdges commented Sep 22, 2019

I think master should avoid ed25519-dalek thanks to 33083a9 although I never published that version. We should try both that and the schizomerlin commit 8150ef6 together.

I've now forgotten if the schizomerlin commit should suffice. If rand_os activates std then we must avoid that too. You might avoid rand_os when you invoke without default features, but another options is upgrading merlin's rand_core version.

@brenzi
Copy link
Author

brenzi commented Sep 22, 2019

Schnorrkel actually works if patching ed25519-dalek to my fork:

[patch.crates-io]
ed25519-dalek = { git = "https://github.com/scs/ed25519-dalek.git", branch = "no_std_sgx"}

@burdges
Copy link
Collaborator

burdges commented Sep 22, 2019

I'd expect merlin might break rand_core without rand_core being upgrade, or at least using the schizomerlin commit 8150ef6, but maybe not.

@burdges
Copy link
Collaborator

burdges commented Nov 1, 2019

Is there no trouble from the failure crate here? We could drop support for it easily I think.

@burdges
Copy link
Collaborator

burdges commented Dec 17, 2019

Is this fixed? I'll publish some new version over Xmas hopefully, well more like January, so if not then maybe then.

@brenzi
Copy link
Author

brenzi commented Dec 18, 2019

We haven't tried with the unpatched ed25519-dalek. If it's fixed there, its fixed for us

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants