
QLeverfile for OSM Country is in old config format #53

Open
Qup42 opened this issue Aug 4, 2024 · 6 comments
Comments


Qup42 commented Aug 4, 2024

The QLeverfile for OSM Country is still in the old format. Two options come to my mind:

  1. Convert the old QLeverfile to the new format. Problem: the CLI arguments for osm2rdf are no longer supported and the replacement is not obvious.
  2. Switch to using pre-built Turtle files from https://osm2rdf.cs.uni-freiburg.de/, as is already done for osm-planet.

Preliminary config for option 1:
# Qleverfile for OSM of some country, use with https://github.com/ad-freiburg/qlever-control
#
# qlever get-data  # downloads .pbf file from Geofabrik and builds .ttl.bz2 using osm2rdf
# qlever index     # for Germany, for example, takes ~30 minutes and ~10 GB RAM (on an AMD Ryzen 9 5900X)
# qlever start     # starts the server
#
# Make sure that osm2rdf is in your path. Set CONTINENT and COUNTRY such that
# the link under GET_DATA_CMD exists (the names are usually the canonical
# names). The time for osm2rdf is around the same as that for "qlever index".

[data]
CONTINENT     = europe
COUNTRY       = switzerland
NAME          = osm-${COUNTRY}
PBF           = ${NAME}.pbf
WITH_TEXT     = false
SETTINGS_JSON = '{ "prefixes-external": [ "\"LINESTRING(", "\"MULTIPOLYGON(", "\"POLYGON(" ], "ascii-prefixes-only": false, "num-triples-per-batch": 1000000 }'
GET_DATA_CMD  = "wget -nc -O ${PBF} https://download.geofabrik.de/${CONTINENT}/${COUNTRY}-latest.osm.pbf; rm -f ${NAME}.*.bz2; ( time osm2rdf ${PBF} -o ${NAME}.ttl --cache . --write-geometric-relation-statistics ) 2>&1 | tee ${NAME}.osm2rdf-log.txt; rm -f spatial-*"
VERSION       = $$(ls -l --time-style=+%d.%m.%Y ${PBF} 2> /dev/null | cut -d' ' -f6)
DESCRIPTION   = "OSM ${COUNTRY}, dump from ${VERSION} with ogc:contains"

[index]
INPUT_FILES    = "${data:NAME}.ttl.bz2"
CAT_INPUT_FILES = lbzcat -f -n 2 ${INPUT_FILES}
STXXL_MEMORY   = 10G
SETTINGS_JSON  = { "prefixes-external": [ "\"LINESTRING(", "\"MULTIPOLYGON(", "\"POLYGON(" ], "ascii-prefixes-only": false, "num-triples-per-batch": 1000000 }

[server]
PORT                        = 7025
MEMORY_FOR_QUERIES          = 20G
ACCESS_TOKEN                = ${data:NAME}_1234567890
CACHE_MAX_SIZE              = 10G
CACHE_MAX_SIZE_SINGLE_ENTRY = 5G
CACHE_MAX_NUM_ENTRIES       = 100

# QLever binaries
[runtime]
SYSTEM = native
IMAGE  = adfreiburg/qlever

# QLever UI
[ui]
UI_CONFIG = osm
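As a quick sanity check (a minimal sketch, not part of the Qleverfile): the SETTINGS_JSON value, without its outer shell quotes, must parse as JSON, so the escaped `\"` prefixes are easy to get wrong. Piping the value through a JSON parser catches quoting mistakes before an index build:

```shell
# Parse the SETTINGS_JSON value from the [data] section above (outer single
# quotes removed); python3 -m json.tool exits non-zero on invalid JSON.
printf '%s' '{ "prefixes-external": [ "\"LINESTRING(", "\"MULTIPOLYGON(", "\"POLYGON(" ], "ascii-prefixes-only": false, "num-triples-per-batch": 1000000 }' \
  | python3 -m json.tool > /dev/null && echo "SETTINGS_JSON is valid JSON"
```

Note that inside the single-quoted shell string, `\"` stays a literal backslash plus quote, which is exactly the JSON escape the prefixes need.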
Preliminary config for option 2:
# Qleverfile for OSM of some country, use with https://github.com/ad-freiburg/qlever-control
#
# qlever get-data  # downloads a pre-built .ttl.bz2 from https://osm2rdf.cs.uni-freiburg.de/
# qlever index     # for Germany, for example, takes ~30 minutes and ~10 GB RAM (on an AMD Ryzen 9 5900X)
# qlever start     # starts the server
#
# Set COUNTRY_CODE to the 3-letter ISO3166 code of the country; the available
# countries are listed at https://osm2rdf.cs.uni-freiburg.de/.

[data]
# 3-letter ISO3166 country code; available countries are listed at https://osm2rdf.cs.uni-freiburg.de/
COUNTRY_CODE      = che
DATA_URL          = https://osm2rdf.cs.uni-freiburg.de/ttl/${COUNTRY_CODE}.osm.ttl.bz2
GET_DATA_CMD      = curl --location --fail --continue-at - --remote-time --output ${NAME}.ttl.bz2 ${DATA_URL}
NAME              = osm-${COUNTRY_CODE}
VERSION           = $$(ls -l --time-style=+%d.%m.%Y ${NAME}.ttl.bz2 2> /dev/null | cut -d' ' -f6)
#VERSION           = $$(date -r ${NAME}.ttl.bz2 +"%d.%m.%Y")
DESCRIPTION       = OSM ${COUNTRY_CODE}, dump from ${VERSION} with ogc:contains

[index]
INPUT_FILES                 = ${data:NAME}.ttl.bz2
CAT_INPUT_FILES             = lbzcat -f -n 2 ${INPUT_FILES}
STXXL_MEMORY                = 10G
SETTINGS_JSON               = { "languages-internal": [], "prefixes-external": [""], "ascii-prefixes-only": false, "num-triples-per-batch": 5000000 }

[server]
PORT                        = 7025
MEMORY_FOR_QUERIES          = 20G
CACHE_MAX_SIZE              = 10G
CACHE_MAX_SIZE_SINGLE_ENTRY = 5G
TIMEOUT                     = 100s

[runtime]
SYSTEM                      = native
IMAGE                       = adfreiburg/qlever

[ui]
UI_CONFIG                   = osm
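To illustrate the commented-out VERSION variant above (a throwaway sketch with a dummy file name): `curl --remote-time` preserves the server's modification time on the downloaded dump, and `date -r` then formats that mtime as dd.mm.yyyy without the fragile `ls | cut` field counting:

```shell
# Create a dummy dump file with a known modification time, then format
# that time the way the Qleverfile's VERSION variable does.
touch -d '2024-08-04' osm-che.ttl.bz2
date -r osm-che.ttl.bz2 +"%d.%m.%Y"   # prints 04.08.2024
rm osm-che.ttl.bz2
```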

Qup42 commented Aug 4, 2024

@hannahbast which option do you think we should go with?

@hannahbast
Member

@Qup42 I already have a new file, which I have been using for some time; I just have to commit it. Do you need it today?


Qup42 commented Aug 4, 2024

No. I am now working with a pre-built .ttl file (option 2) for my attempts to replicate the OSM SPARQL updates.

@hannahbast
Member

@Qup42 Which code are you using to generate the SPARQL updates?


Qup42 commented Aug 4, 2024

Nicolas' code. I wanted to have a look at it now because I'll probably be on the road without my laptop during the meeting.

@andreasalstrup

> @Qup42 I already have a new file, which I have been using for some time; I just have to commit it. Do you need it today?

@hannahbast Can I have the working config file?
