[WIP] added sanparks related scripts #62

Open
wants to merge 4 commits into
base: main

Conversation

DumbMachine
Contributor

No description provided.

@henrykironde
Contributor

@DumbMachine, let us add a WIP tag to this PR. There are a lot of things that we need to change in it.

  • We shall combine all the files into one.
  • We shall wait for a new PR that I am working on to get merged, so that you can easily work with this encoded data (I will let you know).

@DumbMachine changed the title from "added sanparks related scripts" to "[WIP] added sanparks related scripts" on Mar 3, 2020
@henrykironde
Contributor

The autocreate tool is up and ready. Please use it with the encoding option -e latin-1.

@DumbMachine
Contributor Author

The problem with this dataset was:
weecology/retriever#1421 (comment)

I was trying to create the script for this, and I think the metadata for sanparks_karoo_karoo2008.txt is erroneous. I read the karoo2008.txt file with \t as the separator.
We get some unnamed columns:
image
If we try to force the names:
image
The problem arises because each row has 9 values, while the header has only 7.
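
A minimal sketch of how a header/row width mismatch like the one described above could be detected before writing the script. It uses only the standard csv module; the function name find_ragged_rows and the sample data are made up for illustration and are not part of retriever or of the real karoo2008.txt file.

```python
import csv
import io


def find_ragged_rows(text, delimiter="\t"):
    """Return (header_width, [(line_number, row_width), ...]) for every
    non-empty data row whose field count differs from the header's."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    mismatches = []
    # Line 1 is the header, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        if row and len(row) != len(header):
            mismatches.append((line_number, len(row)))
    return len(header), mismatches


# Made-up sample: 7 header names, but 9 values in the data row.
sample = (
    "\t".join(f"col{i}" for i in range(7)) + "\n"
    + "\t".join(str(i) for i in range(9)) + "\n"
)
header_width, bad_rows = find_ragged_rows(sample)
print(header_width, bad_rows)
```

For a real file, the text could be read with open(path, encoding="latin-1").read(), matching the encoding option mentioned earlier in the thread.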

@henrykironde
Contributor

I request that you make this one file. You can do that by downloading all the data into the same folder and using retriever autocreate -d path-to dir. Let me know what you think.

@DumbMachine
Contributor Author

@henrykironde, running retriever autocreate -d temp (where temp is the directory where I downloaded the files) gives this file:
temp.txt

@DumbMachine
Contributor Author

Currently there seem to be problems, as it:

  • ignores peggym.109.1-GGHNPcensusdata.txt
  • gets the columns wrong for peggym.112.1-GGHNPTotals

@henrykironde
Contributor

This file does not have headers. You need to add the headers before you run autocreate on the file.
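
A rough sketch of adding a header line to a headerless delimited file before running autocreate on it. The helper name with_header and the column names are hypothetical; the real column names would have to come from the dataset's metadata.

```python
def with_header(text, column_names, delimiter=","):
    """Return the file contents with a header line prepended."""
    header = delimiter.join(column_names)
    return header + "\n" + text


# Hypothetical column names and made-up data rows, for illustration only.
headerless = "2001,12\n2002,15\n"
fixed = with_header(headerless, ["year", "count"])
print(fixed)
```

The result could then be written back to disk, for example with open(path, "w", encoding="latin-1"), before rerunning autocreate.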

@henrykironde
Contributor

Run autocreate on peggym.109.1-GGHNPcensusdata.txt alone and just copy its resources to the final file. This is because the other files use "," and this file uses tabs.
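
One way to see which files need to be handled separately is to guess each file's delimiter from its first line. This is only a sketch: the function names and the sample file names are made up, and the simple count-based guess assumes the first line contains at least one delimiter.

```python
def guess_delimiter(first_line, candidates=(",", "\t")):
    """Pick the candidate delimiter that occurs most often in the line,
    or None if no candidate occurs at all."""
    counts = {d: first_line.count(d) for d in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def group_by_delimiter(first_lines):
    """Map each guessed delimiter to the list of file names that use it."""
    groups = {}
    for filename, line in first_lines.items():
        groups.setdefault(guess_delimiter(line), []).append(filename)
    return groups


# Made-up file names and first lines, for illustration only.
first_lines = {"comma_file.txt": "a,b,c", "tab_file.txt": "a\tb\tc"}
groups = group_by_delimiter(first_lines)
print(groups)
```

Files that end up in different groups would be run through autocreate separately, as suggested above.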

@DumbMachine
Contributor Author

@henrykironde got the scripts working.
image
The problem right now is that after downloading each file from its respective URL, the file is named metacat for some reason. I have written the appropriate names in the name field of the JSON file:

            "name": "GGHNPTotals.txt",

Can you tell me the reason for this?

@henrykironde
Contributor

Where did you get that? "name": "GGHNPTotals.txt",
Please download and install the current master. I think your current installation of retriever is old. The current master works fine for me.

@DumbMachine
Contributor Author

Where did you get that? "name": "GGHNPTotals.txt"

Got it from the download server.

Please download and install the current master. I think your current installation of retriever is old. The current master works fine for me.

I'll install the current master.

@henrykironde
Contributor

Yes. After installing and rerunning retriever autocreate -e latin-1 -d, and retriever autocreate -e latin-1 -f on the tab files, add all the resources to one script and let me know.
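
A minimal sketch of combining the resources from two generated scripts into one. It assumes each script is a JSON object with a top-level "resources" list whose entries carry a "name" field, as the snippet quoted earlier in the thread suggests; the helper merge_resources and the sample dictionaries are made up for illustration and are not retriever's own API.

```python
import json


def merge_resources(base_script, extra_script):
    """Return a copy of base_script whose "resources" list also contains
    the resources of extra_script, skipping names already present."""
    merged = dict(base_script)
    resources = list(base_script.get("resources", []))
    seen = {resource.get("name") for resource in resources}
    for resource in extra_script.get("resources", []):
        if resource.get("name") not in seen:
            resources.append(resource)
            seen.add(resource.get("name"))
    merged["resources"] = resources
    return merged


# Made-up scripts standing in for the comma-separated and tab-separated runs.
comma_script = {"name": "combined", "resources": [{"name": "table_a"}]}
tab_script = {"name": "tab-only", "resources": [{"name": "table_b"}]}
combined = merge_resources(comma_script, tab_script)
print(json.dumps(combined, indent=2))
```

In practice the two dictionaries would be loaded with json.load from the files that autocreate wrote, and the merged result saved with json.dump.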

@henrykironde
Contributor

I think we should combine these to form a single script.

@DumbMachine
Contributor Author

DumbMachine commented Apr 1, 2020

I tested the scripts a few days ago and they were working. To test the scripts again, we will have to wait as the website seems to be down.

@ha0ye
Member

ha0ye commented Apr 23, 2020

Per the comment from @mbjones, I think the links should be updated to point to the version on DataOne:
weecology/retriever#1421 (comment)

@DumbMachine Let me know if you would like any assistance with this.

Base automatically changed from master to main February 11, 2021 09:15