This repository has been archived by the owner on Feb 19, 2022. It is now read-only.

HTML tag stopwords #78

Open
moltude opened this issue Aug 2, 2013 · 7 comments

Comments

@moltude
Contributor

moltude commented Aug 2, 2013

The site accepts HTML input, but then it only searches for the tags. Either the tag names need to be in the stopwords file, or we need some way of recognizing these tags and ignoring them.
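
For illustration, the stopwords idea could look roughly like the sketch below. The names `STOPWORDS`, `HTML_TAG_STOPWORDS`, and `search_terms` are hypothetical, and the word lists are made up for the example; the project's real stopwords file and tokenizer are not shown in this thread.

```python
import re

# Hypothetical general stopwords; stands in for the project's stopwords file.
STOPWORDS = {"the", "a", "of", "and"}

# Assumed subset of HTML tag and attribute names to ignore as search terms.
HTML_TAG_STOPWORDS = {"html", "head", "body", "div", "span", "p", "br", "href", "class"}

ALL_STOPWORDS = STOPWORDS | HTML_TAG_STOPWORDS


def search_terms(text, stopwords=ALL_STOPWORDS):
    """Lowercase the input, split it into alphanumeric tokens, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in stopwords]


print(search_terms("<div><p>Civil War letters</p></div>"))
```

One drawback of this approach: ordinary words that happen to match a tag name (such as "body") are dropped as well, and attribute values or unlisted tags still leak through as terms.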

@ghost assigned erose Aug 2, 2013

@rlskoeser
Contributor

Better approach: attempt to recognize HTML/XML input and load it with something like BeautifulSoup to get text-only content.

  • It would be interesting to try this with EAD/TEI.
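
A minimal sketch of that idea follows. It uses Python's standard-library `html.parser` instead of BeautifulSoup so that it runs without extra dependencies; the names `TextExtractor`, `looks_like_markup`, and `text_only` are hypothetical, and the markup check is a rough heuristic rather than real HTML/XML detection.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def looks_like_markup(text):
    """Rough heuristic (an assumption): the input contains something tag-shaped."""
    return bool(re.search(r"<[A-Za-z!/?][^>]*>", text))


def text_only(raw):
    """Return the text content of HTML/XML input; pass plain text through unchanged."""
    if not looks_like_markup(raw):
        return raw
    parser = TextExtractor()
    parser.feed(raw)
    parser.close()
    # Join chunks with spaces so adjacent elements do not run together,
    # then collapse repeated whitespace.
    return " ".join(" ".join(parser.chunks).split())


print(text_only("<p>Letters from <b>Atlanta</b> in 1864</p>"))
```

Because the parser is lenient, the same function also strips tags from simple XML such as TEI or EAD fragments, though it knows nothing about their schemas. A BeautifulSoup version would follow the same shape.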

@moltude
Contributor Author

moltude commented Aug 3, 2013

Ah ha! Python libraries that do what needs to be done // still learning

@mialondon
Contributor

@moltude do you have the test notes from when you and @amrys were trying out her RefWorks (? might have been EndNote etc) to use as a reference for this?

And any TEI examples would be useful too.

@amrys
Contributor

amrys commented Oct 21, 2013

I'm attaching the screenshot of what happened when I initially gave the
machine my BibTeX library, if that helps.

--a.

On Thu, Oct 17, 2013 at 8:47 PM, Mia notifications@github.com wrote:

> @moltude do you have the test notes from when you and @amrys were trying out her RefWorks (? might have been EndNote etc) to use as a reference for this?
>
> And any TEI examples would be useful too.

@rlskoeser
Contributor

@amrys I'm not seeing a screenshot. You might have to use the GitHub web interface (not sure if you can add attachments via email).

@amrys
Contributor

amrys commented Oct 23, 2013

Roger that. I will try to remember my GitHub login after I take care of
lecture tomorrow.

a.

On Wed, Oct 23, 2013 at 5:11 PM, Rebecca Sutton Koeser notifications@github.com wrote:

> @amrys I'm not seeing a screenshot. You might have to use the GitHub web interface (not sure if you can add attachments via email).

@amrys
Contributor

amrys commented Oct 30, 2013

Hi folks,

Sorry for the delay -- finally got around to sorting out my GitHub password (something that apparently fell straight out of my brain when I was in Finland). I've attached the image here.

a.
[Image: bibtex-problems]
