crawl_bioRxiv2

Summarise the number of words in each section of articles submitted to bioRxiv.

After data cleaning, a total of 42,348 papers submitted to bioRxiv before Oct 15, 2019 were analyzed.
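As a rough illustration of the per-section summary described above, the sketch below counts words in each section of one article. The helper names (`count_words`, `section_word_counts`), the dictionary layout, and the tokenization rule (whitespace-separated tokens containing at least one letter or digit) are assumptions for illustration; the counting rule actually used in this repository is not stated here.

```python
import re


def count_words(text):
    """Count whitespace-separated tokens that contain a letter or digit."""
    return sum(1 for token in text.split() if re.search(r"\w", token))


def section_word_counts(sections):
    """Map each section name to the word count of its text."""
    return {name: count_words(body) for name, body in sections.items()}


# Hypothetical article text, made up for this example.
example = {
    "ABSTRACT": "We measure section lengths in preprints.",
    "METHOD": "Full text was split into sections - then counted.",
}
counts = section_word_counts(example)
print(counts)
```

Stand-alone punctuation such as the dash in the METHOD text is not counted as a word under this rule.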

Summary of word count in each section

  1. ABSTRACT

[Blue vertical dashed lines mark word counts from 150 to 400 in steps of 50. Clear peaks appear at these lines.]

  • It seems that many authors deleted words to meet journal word limits before submitting.
  2. INTRODUCTION

  3. METHOD

  4. RESULT

  5. DISCUSSION

  6. Number of REFERENCE

  7. All sections together

[The x-axis is truncated at 50,000.]
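The peaks in ABSTRACT length at multiples of 50 can be checked numerically by comparing how many abstracts fall just at or under each candidate limit with how many fall just over it. The following is a minimal sketch under stated assumptions: the function name `counts_around_limits`, the window of 5 words, and the toy word counts are all hypothetical and are not taken from `bioData_clean.txt`.

```python
def counts_around_limits(word_counts, limits=range(150, 401, 50), window=5):
    """For each limit, count values just under and just over it.

    Returns {limit: (below, above)} where
      below = number of values n with limit - window < n <= limit
      above = number of values n with limit < n <= limit + window
    """
    result = {}
    for limit in limits:
        below = sum(1 for n in word_counts if limit - window < n <= limit)
        above = sum(1 for n in word_counts if limit < n <= limit + window)
        result[limit] = (below, above)
    return result


# Made-up abstract lengths for illustration only.
toy = [148, 149, 150, 150, 152, 199, 200, 200, 203, 250, 251, 310]
around = counts_around_limits(toy)
for limit, (below, above) in around.items():
    print(limit, below, above)
```

A `below` count much larger than the `above` count at a given limit is consistent with authors trimming abstracts to fit that limit, although it does not prove it.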

Correlation among sections

Relationship between REFERENCE and each section

Using multiple linear regression, all sections except ABSTRACT had an effect on the number of REFERENCE. As expected, the length of DISCUSSION had the largest effect on the number of REFERENCE.
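To make the regression step concrete, here is a minimal sketch of a multiple linear regression fitted by ordinary least squares through the normal equations, using only the Python standard library. It is not the code used for the analysis above: the function name `fit_ols`, the choice of two predictors, and the toy data (generated exactly from `5 + 0.01 * intro + 0.02 * disc`) are assumptions for illustration. It estimates coefficients only and does not compute the significance tests needed to decide whether a section has an effect.

```python
def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    X is a list of predictor rows, y a list of responses.
    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]
    p = len(rows[0])
    # Augmented normal equations [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] for i in range(p)]


# Toy word counts, made up for this example (not from bioData_clean.txt).
intro = [500, 800, 600, 1000, 700]
disc = [1000, 900, 1500, 1200, 2000]
refs = [5 + 0.01 * a + 0.02 * b for a, b in zip(intro, disc)]

coef = fit_ols([[a, b] for a, b in zip(intro, disc)], refs)
print([round(c, 4) for c in coef])
```

Because the predictors are on the same scale (word counts), the fitted slopes can be compared directly; with predictors on different scales, they would need to be standardized first.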

DATA

https://github.com/Yiguan/crawl_bioRxiv2/blob/master/bioData_clean.txt