Repo created

Fr4nz D13trich 2025-11-22 13:58:55 +01:00
parent 4af19165ec
commit 68073add76
12458 changed files with 12350765 additions and 2 deletions

@@ -0,0 +1,68 @@
Edit the rclone conf secret for Codeberg Actions so that it delivers maps to e.g. /var/www/html/maps/251231 via a limited user.
apt update
apt install nginx vim
### set hostname for ssh sanity (will show in console upon next bash launch):
vim /etc/hostname
hostname cdn-XX-1
### for SSL:
sudo snap install --classic certbot
sudo certbot --nginx
### remove IPs from logging on line ~36:
vim /etc/nginx/nginx.conf
```
##
# Logging Settings
##
log_format comaps '0.0.0.0 - - [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log comaps;
```
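The `comaps` format above is nginx's combined format with the client address replaced by a literal `0.0.0.0`, so goaccess can still parse it with `log-format COMBINED`. A quick parse check (the sample line is hypothetical):

```python
import re

# Combined-style pattern matching the anonymized `comaps` log_format:
# the IP is always 0.0.0.0, the rest follows nginx's default combined format.
COMBINED = re.compile(
    r'^0\.0\.0\.0 - - \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# A hypothetical line as the config above would emit it.
sample = ('0.0.0.0 - - [22/Nov/2025:13:58:55 +0100] '
          '"GET /maps/251104/World.mwm HTTP/1.1" 200 1024 "-" "Mozilla/5.0"')

match = COMBINED.match(sample)
```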
### set up monitoring:
apt install goaccess
edit `/etc/goaccess/goaccess.conf` and uncomment `time-format %H:%M:%S`, `date-format %Y-%m-%d`, and `log-format COMBINED`
vim /etc/crontab
`*/5 * * * * root /usr/bin/goaccess /var/log/nginx/access.log -o /var/www/html/monitor.html`
### set up basic http pages/responses:
cd /var/www/html/
mkdir maps
rm index.nginx-debian.html
wget https://www.comaps.app/favicon.ico
vim robots.txt
```
User-agent: *
Disallow: /
```
vim index.html
```
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>CoMaps CDN</title>
</head>
<body>
<h1>This is a CDN for <a href="https://comaps.app">CoMaps</a></h1>
<h2>Resources:</h2>
<ol>
<li>CoMaps <a href="https://cdn.comaps.app/subway/">subway validator</a></li>
<li>CoMaps <a href="https://comaps.app/news/">News</a></li>
<li><a href="https://comaps.app/donate/">Donate</a></li>
</ol>
</body>
</html>
```

@@ -0,0 +1,20 @@
# French National Library Archiving
The library has taken an interest in archiving CoMaps and its data as a snapshot
of our world and of the way people interact with maps, in a form that doesn't rely on
maintaining servers. (With an APK, the MWM files, and some copy-paste, the app can be
reproduced on an emulator.)
## Instructions
Every 6 months or so, @jeanbaptisteC may ask to upload the most recent map version
and a custom APK with a bundled World map (googleRelease), signed with production keys (like the web release).
Credentials for `frlibrary` are in the mapgen rclone, or in zyphlar/pastk's password managers.
To upload (modify dates accordingly):
```
rclone copy CoMaps-25110702-google-release.apk frlibrary:/apk/
rclone copy 251104 frlibrary:/maps/251104
```

@@ -0,0 +1,186 @@
# maps_generator
`maps_generator` is the Python CLI for generating `.mwm` map files for CoMaps. It functions as the driver for the `generator_tool` C++ executable.
**Use the `generator_tool` and application from the same release. The application does not support
maps built by a generator_tool newer than the app.**
## What are maps?
Maps are `.mwm` binary files with special meta-information for rendering, searching, routing, and other use cases.
Files in [data/borders](https://codeberg.org/comaps/comaps/src/branch/main/data/borders) define the boundaries of each individual map file. The world is segmented into separate files along these boundaries, with the intent of having manageably small files to download. These files are referred to as *maps* or *countries*. A *country* here refers to one of these files, not necessarily a geographic country. Also note that there are two special countries called *World* and *WorldCoasts*: small, simplified maps of the world and its coastlines (sea and ocean water cover) used when other maps have not yet been downloaded.
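The border files use the Osmosis polygon filter (`.poly`) format: a name line, one or more ring sections of lon/lat pairs each closed by `END`, and a final `END`. A minimal parser sketch (the `Macedonia` sample below is illustrative, not the real border):

```python
def parse_poly(text):
    """Parse an Osmosis .poly file into a list of rings (lists of (lon, lat))."""
    lines = iter(text.strip().splitlines())
    next(lines)  # first line: region name
    rings, ring = [], None
    for line in lines:
        token = line.strip()
        if token == "END":
            if ring is None:   # a second END closes the whole file
                break
            rings.append(ring)
            ring = None
        elif ring is None:
            ring = []          # a section header (e.g. "1") starts a new ring
        else:
            lon, lat = map(float, token.split())
            ring.append((lon, lat))
    return rings

# Illustrative sample, not an actual border from data/borders/:
SAMPLE = """Macedonia
1
20.45 40.85
23.05 40.85
23.05 42.40
20.45 42.40
20.45 40.85
END
END
"""
```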
## Setup
You must have Python version >= 3.7 and complete the following steps:
1. Switch to the branch matching your app's version (see the note in the #maps_generator section above). E.g.:
```sh
git checkout 2023.06.04-13-android
```
The app version can be found in the "About" section of CoMaps.
2. Build the `generator_tool` binary (run from the root of the repo):
```sh
./tools/unix/build_omim.sh -r generator_tool
./tools/unix/build_omim.sh -r world_roads_builder_tool
./tools/unix/build_omim.sh -r mwm_diff_tool
```
3. Go to the `python` directory:
```sh
cd tools/python/
```
4. Install python dependencies:
```sh
pip install -r maps_generator/requirements_dev.txt
```
5. Create a [configuration file with defaults](https://codeberg.org/comaps/comaps/src/branch/main/tools/python/maps_generator/var/etc/map_generator.ini.default):
```sh
cp maps_generator/var/etc/map_generator.ini.default maps_generator/var/etc/map_generator.ini
```
6. Read through and edit the configuration file.
Ensure that `OMIM_PATH` is set correctly.
The default `PLANET_URL` setting makes the generator download an OpenStreetMap dump file for North Macedonia from [Geofabrik](http://download.geofabrik.de/index.html). Change `PLANET_URL` and `PLANET_MD5_URL` to the region you want.
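Before a long build it can be worth confirming that a downloaded dump matches the checksum published at `PLANET_MD5_URL`. A sketch, assuming the usual `<digest>  <filename>` layout of `.md5` files (the helper names are illustrative):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it into memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(pbf_path, md5_path):
    """Compare a dump against its .md5 file (format: '<digest>  <filename>')."""
    with open(md5_path) as f:
        expected = f.read().split()[0]
    return md5_of(pbf_path) == expected
```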
## Basic Usage
Make sure you are in the `tools/python` repo directory before starting the generator.
```sh
cd tools/python
```
Build a `.mwm` map file for North Macedonia without using coastlines (it's a land-locked country anyway):
```sh
python3 -m maps_generator --countries="Macedonia" --skip="Coastline"
```
It's possible to skip coastlines for countries that have a sea coast too, but the sea water will not be rendered in that case.
Make sure that you specify country names that are actually contained in your pbf file, or you'll get errors in the next step. Check the filenames in the `data/borders/` folder (without the `.poly` extension) for a list of all valid country names. For example, New York City is in `US_New York_New_York`, and all of England (i.e. the UK minus Northern Ireland, Scotland, and Wales) can be generated by specifying `UK_England_*`.
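The matching behaviour can be sketched roughly like this (the `expand_countries` helper and the `BORDERS` list are illustrative, not the generator's actual code):

```python
def expand_countries(patterns, all_countries):
    """Expand --countries style patterns against the border file names.

    A trailing '*' matches any name with that prefix; anything else must
    match a border name exactly.
    """
    result = []
    for pattern in patterns:
        if pattern.endswith("*"):
            prefix = pattern[:-1]
            result.extend(c for c in all_countries if c.startswith(prefix))
        elif pattern in all_countries:
            result.append(pattern)
        else:
            raise ValueError(f"Bad input country: {pattern}")
    return result

# Illustrative subset of data/borders/ names (without .poly):
BORDERS = ["Macedonia", "UK_England_Essex", "UK_England_Kent", "US_New York_New_York"]
```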
To see other possible command-line options:
```sh
python3 -m maps_generator -h
```
## Troubleshooting
The general log file (by default it's `maps_build/generation.log`) contains the output of the `maps_generator` Python script only. More detailed logs that include the output of the `generator_tool` binary are located in the `logs/` subdirectory of a particular build directory, e.g. `maps_build/2023_06_04__20_05_07/logs/`.
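To triage a failed build quickly, a small script can scan a build's `logs/` directory for error lines. This is a sketch, not part of `maps_generator`:

```python
import os

def collect_errors(logs_dir, markers=("ERROR", "CRITICAL")):
    """Map each *.log filename under logs_dir to its lines containing a marker."""
    hits = {}
    for name in sorted(os.listdir(logs_dir)):
        if not name.endswith(".log"):
            continue
        with open(os.path.join(logs_dir, name)) as f:
            matches = [line.rstrip("\n") for line in f
                       if any(m in line for m in markers)]
        if matches:
            hits[name] = matches
    return hits
```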
## More Examples
### Japan with coastlines
1. Open https://download.geofabrik.de/asia/japan.html and copy the URLs of the .osm.pbf file and its .md5 checksum file.
2. Put the urls into the `PLANET_URL` and `PLANET_MD5_URL` settings of the `map_generator.ini` file.
3. Set `PLANET_COASTS_URL` to a location with `latest_coasts.geom` and `latest_coasts.rawgeom` files. You don't need to download these files if the whole planet is built. They are generated in the process of building the whole planet (the coastline should be valid and continuous for it to succeed).
4. Run
```sh
python3 -m maps_generator --countries="World, WorldCoasts, Japan_*"
```
### Rebuild stages
For example, you changed routing code in the project and want to regenerate maps.
You must have a previous build available. You can then regenerate starting from the routing stage, and only for two mwms:
```sh
python3 -m maps_generator -c --from_stage="Routing" --countries="Japan_Kinki Region_Osaka_Osaka, Japan_Chugoku Region_Tottori"
```
### Custom maps from GeoJSON
If you have an OSM PBF file and want to cut custom map regions, you can use a polygon feature in a GeoJSON file. This is a useful alternative if you want a custom area, or do not want to figure out which country (or countries) applies to the area you need.
1. If you don't already have the .osm.pbf file, download applicable area of the world in .osm.pbf format, for example from [Geofabrik](http://download.geofabrik.de/index.html).
2. Generate a GeoJSON of the territory you are interested in. You can do it via [geojson.io](http://geojson.io/): select the area on the map and copy the corresponding part of the resulting GeoJSON. You need the contents of `features: [ { ... } ]`, i.e. not the `features` array itself but the inner object with its braces: `{...}`. For example, here is the full GeoJSON of a rectangular area around Melbourne:
```json
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[143.75610351562497, -39.21523130910491],
[147.98583984375, -39.21523130910491],
[147.98583984375, -36.03133177633187],
[143.75610351562497, -36.03133177633187],
[143.75610351562497, -39.21523130910491]
]
]
}
}
]
}
```
You need to copy this part of the geojson:
```json
{
"type": "Feature",
"properties": {},
"geometry": {
"type": "Polygon",
"coordinates": [
[
[143.75610351562497, -39.21523130910491],
[147.98583984375, -39.21523130910491],
[147.98583984375, -36.03133177633187],
[143.75610351562497, -36.03133177633187],
[143.75610351562497, -39.21523130910491]
]
]
}
}
```
3. Save the selected GeoJSON to a file with the .geojson extension, for example `borders.geojson`.
4. Extract this area from the .osm.pbf file with the [osmium tool](https://osmcode.org/osmium-tool/):
```sh
osmium extract -p borders.geojson germany-latest.osm.pbf -o germany_part.osm.pbf
```
5. Run the `maps_generator` tool:
```sh
python3 -m maps_generator --skip="Coastline" --without_countries="World*"
```
In this example we skipped generation of the World\* files because they are among the most time- and resource-consuming mwms to build.
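Rather than copy-pasting the inner object by hand in step 2, the first feature can be pulled out of the FeatureCollection with a few lines (a sketch; `extract_first_feature` is not part of the tooling):

```python
import json

def extract_first_feature(collection_text):
    """Return the first Feature of a GeoJSON FeatureCollection as its own object."""
    collection = json.loads(collection_text)
    assert collection.get("type") == "FeatureCollection", collection.get("type")
    return collection["features"][0]

# geojson.io-style output for a rectangle (abbreviated coordinates):
COLLECTION = """{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[143.756, -39.215], [147.986, -39.215],
                                   [147.986, -36.031], [143.756, -36.031],
                                   [143.756, -39.215]]]}}
  ]
}"""

feature = extract_first_feature(COLLECTION)
```

Writing `feature` to `borders.geojson` with `json.dump` then feeds straight into the osmium step above.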
### Subways layer
You can manually generate a subway layer file to use in the `SUBWAY_URL` ini setting. See [instructions](https://codeberg.org/comaps/comaps/src/branch/main/docs/SUBWAY_GENERATION.md).
## Testing maps
If you're testing a new feature, you will likely want to test the maps locally.
### iOS
The easiest approach is to use the Simulator and swap out the map file in the app's Documents folder.
Finding the folder is slightly tricky; the easiest way is to look in the Xcode debug message window, which often prints messages that contain the Documents folder path.
E.g.,
```
I(1) 0.11666 platform/string_storage_base.cpp:24 StringStorageBase(): Settings path: /Users/<user-name>/Library/Developer/CoreSimulator/Devices/EFE74BF2-2871-4364-A633-BC8F1BAB9DF3/data/Containers/Data/Application/252BDFA5-3E60-43A6-B09C-158BC55DC450/Documents/settings.ini
```
In this folder, the map file is in a YYMMDD subfolder.
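Rather than fishing the path out of the Xcode console, the simulator's data directory can be searched for map folders. A sketch assuming the layout shown in the log line above:

```python
import os

def find_map_folders(root):
    """Return .../Documents/<subdir> directories under root that contain .mwm files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        parent = os.path.basename(os.path.dirname(dirpath))
        if parent == "Documents" and any(f.endswith(".mwm") for f in filenames):
            found.append(dirpath)
    return sorted(found)
```

Pointing `root` at `os.path.expanduser("~/Library/Developer/CoreSimulator/Devices")` should list every simulator app's map folders.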

@@ -0,0 +1,19 @@
import os

from maps_generator.generator import settings

CONFIG_PATH = os.path.join(
    os.path.dirname(os.path.realpath(__file__)),
    "var",
    "etc",
    "map_generator.ini",
)

print(f"Loading configuration from {CONFIG_PATH}")

settings.init(CONFIG_PATH)

from maps_generator.generator import stages_declaration
from maps_generator.generator.stages import stages

stages.init()

@@ -0,0 +1,251 @@
import logging
import os
from argparse import ArgumentParser
from argparse import RawDescriptionHelpFormatter

from maps_generator.generator import settings
from maps_generator.generator import stages
from maps_generator.generator import stages_declaration as sd
from maps_generator.generator.env import Env
from maps_generator.generator.env import PathProvider
from maps_generator.generator.env import WORLDS_NAMES
from maps_generator.generator.env import find_last_build_dir
from maps_generator.generator.env import get_all_countries_list
from maps_generator.generator.exceptions import ContinueError
from maps_generator.generator.exceptions import SkipError
from maps_generator.generator.exceptions import ValidationError
from maps_generator.maps_generator import generate_coasts
from maps_generator.maps_generator import generate_maps
from maps_generator.utils.algo import unique

logger = logging.getLogger("maps_generator")


def parse_options():
    parser = ArgumentParser(
        description="A tool to generate map files in Organic Maps' .mwm format.",
        epilog="See maps_generator/README.md for setup instructions and usage examples.",
        formatter_class=RawDescriptionHelpFormatter,
        parents=[settings.parser],
    )
    parser.add_argument(
        "-c",
        "--continue",
        default="",
        nargs="?",
        type=str,
        help="Continue the last build or the one specified in CONTINUE from the "
        "last stopped stage.",
    )
    parser.add_argument(
        "-s",
        "--suffix",
        default="",
        type=str,
        help="Suffix of the name of a build directory.",
    )
    parser.add_argument(
        "--countries",
        type=str,
        default="",
        help="List of countries/regions, separated by a comma or a semicolon, or a path to "
        "a file with a newline-separated list of regions, for which maps "
        "should be built. Filenames in data/borders/ (without the .poly extension) "
        "represent all valid region names. "
        "A * wildcard is accepted, e.g. --countries=\"UK*\" will match "
        "UK_England_East Midlands, UK_England_East of England_Essex, etc.",
    )
    parser.add_argument(
        "--without_countries",
        type=str,
        default="",
        help="List of countries/regions to exclude from generation. "
        "Has a priority over --countries and uses the same syntax.",
    )
    parser.add_argument(
        "--skip",
        type=str,
        default="",
        help=f"List of stages, separated by a comma or a semicolon, "
        f"for which building will be skipped. Available skip stages: "
        f"{', '.join([s.replace('stage_', '') for s in stages.stages.get_visible_stages_names()])}.",
    )
    parser.add_argument(
        "--from_stage",
        type=str,
        default="",
        help=f"Stage from which maps will be rebuilt. Available stages: "
        f"{', '.join([s.replace('stage_', '') for s in stages.stages.get_visible_stages_names()])}.",
    )
    parser.add_argument(
        "--coasts",
        default=False,
        action="store_true",
        help="Build only WorldCoasts.raw and WorldCoasts.rawgeom files.",
    )
    parser.add_argument(
        "--force_download_files",
        default=False,
        action="store_true",
        help="If build is continued, files will always be downloaded again.",
    )
    parser.add_argument(
        "--production",
        default=False,
        action="store_true",
        help="Build production maps. Otherwise 'OSM-data-only maps' are built "
        "without additional data like SRTM.",
    )
    parser.add_argument(
        "--order",
        type=str,
        default=os.path.join(
            os.path.dirname(os.path.abspath(__file__)),
            "var/etc/file_generation_order.txt",
        ),
        help="Mwm generation order, useful to have particular maps completed first "
        "in a long build (defaults to maps_generator/var/etc/file_generation_order.txt "
        "to process big countries first).",
    )
    return parser.parse_args()


def main():
    root = logging.getLogger()
    root.addHandler(logging.NullHandler())
    options = parse_options()

    # Processing of the 'continue' option.
    # If 'continue' is set, maps generation is continued from the last build
    # that is found automatically.
    build_name = None
    continue_ = getattr(options, "continue")
    if continue_ is None or continue_:
        d = find_last_build_dir(continue_)
        if d is None:
            raise ContinueError(
                "The build cannot continue: the last build directory was not found."
            )
        build_name = d

    countries_line = ""
    without_countries_line = ""
    if "COUNTRIES" in os.environ:
        countries_line = os.environ["COUNTRIES"]
    if options.countries:
        countries_line = options.countries
    else:
        countries_line = "*"

    if options.without_countries:
        without_countries_line = options.without_countries

    all_countries = get_all_countries_list(PathProvider.borders_path())

    def end_star_compare(prefix, full):
        return full.startswith(prefix)

    def compare(a, b):
        return a == b

    def get_countries_set_from_line(line):
        countries = []
        used_countries = set()
        countries_list = []
        if os.path.isfile(line):
            with open(line) as f:
                countries_list = [x.strip() for x in f]
        elif line:
            countries_list = [x.strip() for x in line.replace(";", ",").split(",")]

        for country_item in countries_list:
            cmp = compare
            _raw_country = country_item[:]
            if _raw_country and _raw_country[-1] == "*":
                _raw_country = _raw_country.replace("*", "")
                cmp = end_star_compare

            for country in all_countries:
                if cmp(_raw_country, country):
                    used_countries.add(country_item)
                    countries.append(country)

        countries = unique(countries)
        diff = set(countries_list) - used_countries
        if diff:
            raise ValidationError(f"Bad input countries: {', '.join(diff)}")
        return set(countries)

    countries = get_countries_set_from_line(countries_line)
    without_countries = get_countries_set_from_line(without_countries_line)
    countries -= without_countries
    countries = list(countries)
    if not countries:
        countries = all_countries

    # Processing of the 'order' option.
    # It defines the order of countries generation using a file from the 'order' path.
    if options.order:
        ordered_countries = []
        countries = set(countries)
        with open(options.order) as file:
            for c in file:
                if c.strip().startswith("#"):
                    continue
                c = c.split("\t")[0].strip()
                if c in countries:
                    ordered_countries.append(c)
                    countries.remove(c)
        if countries:
            raise ValueError(
                f"{options.order} does not have an order " f"for {countries}."
            )
        countries = ordered_countries

    # Processing of the 'skip' option.
    skipped_stages = set()
    if options.skip:
        for s in options.skip.replace(";", ",").split(","):
            stage = s.strip()
            if not stages.stages.is_valid_stage_name(stage):
                raise SkipError(f"{stage} not found.")
            skipped_stages.add(stages.get_stage_type(stage))

    if settings.PLANET_URL != settings.DEFAULT_PLANET_URL:
        skipped_stages.add(sd.StageUpdatePlanet)

    if sd.StageCoastline in skipped_stages:
        if any(x in WORLDS_NAMES for x in options.countries):
            raise SkipError(
                f"You can not skip {stages.get_stage_name(sd.StageCoastline)}"
                f" if you want to generate {WORLDS_NAMES}."
                f" You can exclude them with --without_countries option."
            )

    if not settings.NEED_PLANET_UPDATE:
        skipped_stages.add(sd.StageUpdatePlanet)

    if not settings.NEED_BUILD_WORLD_ROADS:
        skipped_stages.add(sd.StagePrepareRoutingWorld)
        skipped_stages.add(sd.StageRoutingWorld)

    # Make env and run maps generation.
    env = Env(
        countries=countries,
        production=options.production,
        build_name=build_name,
        build_suffix=options.suffix,
        skipped_stages=skipped_stages,
        force_download_files=options.force_download_files,
    )
    from_stage = None
    if options.from_stage:
        from_stage = f"{options.from_stage}"
    if options.coasts:
        generate_coasts(env, from_stage)
    else:
        generate_maps(env, from_stage)
    env.finish()


if __name__ == "__main__":
    main()

@@ -0,0 +1,60 @@
import argparse
import sys

from maps_generator.checks.default_check_set import CheckType
from maps_generator.checks.default_check_set import LogsChecks
from maps_generator.checks.default_check_set import get_logs_check_sets_and_filters
from maps_generator.checks.default_check_set import run_checks_and_print_results


def get_args():
    parser = argparse.ArgumentParser(
        description="This script checks maps generation logs and prints results."
    )
    parser.add_argument(
        "--old", type=str, required=True, help="Path to old logs directory.",
    )
    parser.add_argument(
        "--new", type=str, required=True, help="Path to new logs directory.",
    )
    parser.add_argument(
        "--checks",
        action="store",
        type=str,
        nargs="*",
        default=None,
        help=f"Set of checks: {', '.join(c.name for c in LogsChecks)}. "
        f"By default, all checks will run.",
    )
    parser.add_argument(
        "--level",
        type=str,
        required=False,
        choices=("low", "medium", "hard", "strict"),
        default="medium",
        help="Messages level.",
    )
    parser.add_argument(
        "--output",
        type=str,
        required=False,
        default="",
        help="Path to output file. stdout by default.",
    )
    return parser.parse_args()


def main():
    args = get_args()
    checks = {LogsChecks[c] for c in args.checks} if args.checks is not None else None
    s = get_logs_check_sets_and_filters(args.old, args.new, checks)
    run_checks_and_print_results(
        s,
        CheckType[args.level],
        file=open(args.output, "w") if args.output else sys.stdout,
    )


if __name__ == "__main__":
    main()

@@ -0,0 +1,65 @@
import argparse
import sys

from maps_generator.checks.default_check_set import CheckType
from maps_generator.checks.default_check_set import MwmsChecks
from maps_generator.checks.default_check_set import get_mwm_check_sets_and_filters
from maps_generator.checks.default_check_set import run_checks_and_print_results


def get_args():
    parser = argparse.ArgumentParser(
        description="This script checks mwms and prints results."
    )
    parser.add_argument(
        "--old", type=str, required=True, help="Path to old mwm directory.",
    )
    parser.add_argument(
        "--new", type=str, required=True, help="Path to new mwm directory.",
    )
    parser.add_argument(
        "--categories", type=str, required=True, help="Path to categories file.",
    )
    parser.add_argument(
        "--checks",
        action="store",
        type=str,
        nargs="*",
        default=None,
        help=f"Set of checks: {', '.join(c.name for c in MwmsChecks)}. "
        f"By default, all checks will run.",
    )
    parser.add_argument(
        "--level",
        type=str,
        required=False,
        choices=("low", "medium", "hard", "strict"),
        default="medium",
        help="Messages level.",
    )
    parser.add_argument(
        "--output",
        type=str,
        required=False,
        default="",
        help="Path to output file. stdout by default.",
    )
    return parser.parse_args()


def main():
    args = get_args()
    checks = {MwmsChecks[c] for c in args.checks} if args.checks else None
    s = get_mwm_check_sets_and_filters(
        args.old, args.new, checks, categories_path=args.categories
    )
    run_checks_and_print_results(
        s,
        CheckType[args.level],
        file=open(args.output, "w") if args.output else sys.stdout,
    )


if __name__ == "__main__":
    main()

@@ -0,0 +1,309 @@
import os
import sys
from abc import ABC
from abc import abstractmethod
from collections import namedtuple
from enum import Enum
from functools import lru_cache
from typing import Any
from typing import Callable
from typing import List

ResLine = namedtuple("ResLine", ["previous", "current", "diff", "arrow"])


class Arrow(Enum):
    zero = 0
    down = 1
    up = 2


ROW_TO_STR = {
    Arrow.zero: "◄►",
    Arrow.down: "▼",
    Arrow.up: "▲",
}


def norm(value):
    if isinstance(value, (int, float)):
        return abs(value)
    elif hasattr(value, "__len__"):
        return len(value)
    elif hasattr(value, "norm"):
        return value.norm()
    assert False, type(value)


def get_rel(r: ResLine) -> float:
    rel = 0.0
    if r.arrow != Arrow.zero:
        prev = norm(r.previous)
        if prev == 0:
            rel = 100.0
        else:
            rel = norm(r.diff) * 100.0 / prev
    return rel


class Check(ABC):
    """
    Base class for any checks.

    Usual flow:
        # Create a check object.
        check = AnyCheck("ExampleCheck")
        # Do work.
        check.check()
        # Get results and process them
        raw_result = check.get_result()
        process_result(raw_result)
        # or print the result
        check.print()
    """

    def __init__(self, name: str):
        self.name = name

    def print(self, silent_if_no_results=False, filt=None, file=sys.stdout):
        s = self.formatted_string(silent_if_no_results, filt)
        if s:
            print(s, file=file)

    @abstractmethod
    def check(self):
        """
        Performs the logic of the check.
        """
        pass

    @abstractmethod
    def get_result(self) -> Any:
        """
        Returns a raw result of the check.
        """
        pass

    @abstractmethod
    def formatted_string(self, silent_if_no_results=False, *args, **kwargs) -> str:
        """
        Returns a formatted string of a raw result of the check.
        """
        pass


class CompareCheckBase(Check, ABC):
    def __init__(self, name: str):
        super().__init__(name)
        self.op: Callable[
            [Any, Any], Any
        ] = lambda previous, current: current - previous
        self.do: Callable[[Any], Any] = lambda x: x
        self.zero: Any = 0
        self.diff_format: Callable[[Any], str] = lambda x: str(x)
        self.format: Callable[[Any], str] = lambda x: str(x)
        self.filt: Callable[[Any], bool] = lambda x: True

    def set_op(self, op: Callable[[Any, Any], Any]):
        self.op = op

    def set_do(self, do: Callable[[Any], Any]):
        self.do = do

    def set_zero(self, zero: Any):
        self.zero = zero

    def set_diff_format(self, diff_format: Callable[[Any], str]):
        self.diff_format = diff_format

    def set_format(self, format: Callable[[Any], str]):
        self.format = format

    def set_filt(self, filt: Callable[[Any], bool]):
        self.filt = filt


class CompareCheck(CompareCheckBase):
    def __init__(
        self, name: str, old: Any, new: Any,
    ):
        super().__init__(name)
        self.old = old
        self.new = new
        self.result = None

    def get_result(self) -> ResLine:
        return self.result

    def check(self):
        previous = self.do(self.old)
        if previous is None:
            return False

        current = self.do(self.new)
        if current is None:
            return False

        diff = self.op(previous, current)
        if diff is None:
            return False

        arrow = Arrow.zero
        if diff > self.zero:
            arrow = Arrow.up
        elif diff < self.zero:
            arrow = Arrow.down

        self.result = ResLine(
            previous=previous, current=current, diff=diff, arrow=arrow
        )
        return True

    def formatted_string(self, silent_if_no_results=False, *args, **kwargs) -> str:
        assert self.result
        if silent_if_no_results and self.result.arrow == Arrow.zero:
            return ""

        rel = get_rel(self.result)
        return (
            f"{self.name}: {ROW_TO_STR[self.result.arrow]} {rel:.2f}% "
            f"[{self.format(self.result.previous)} → "
            f"{self.format(self.result.current)}: "
            f"{self.diff_format(self.result.diff)}]"
        )


class CompareCheckSet(CompareCheckBase):
    def __init__(self, name: str):
        super().__init__(name)
        self.checks = []

    def add_check(self, check: Check):
        self.checks.append(check)

    def set_op(self, op: Callable[[Any, Any], Any]):
        for c in self.checks:
            c.set_op(op)

    def set_do(self, do: Callable[[Any], Any]):
        for c in self.checks:
            c.set_do(do)

    def set_zero(self, zero: Any):
        for c in self.checks:
            c.set_zero(zero)

    def set_diff_format(self, diff_format: Callable[[Any], str]):
        for c in self.checks:
            c.set_diff_format(diff_format)

    def set_format(self, format: Callable[[Any], str]):
        for c in self.checks:
            c.set_format(format)

    def check(self):
        for c in self.checks:
            c.check()

    def get_result(self) -> List[ResLine]:
        return [c.get_result() for c in self._with_result()]

    def formatted_string(self, silent_if_no_results=False, filt=None, _offset=0) -> str:
        sets = filter(lambda c: isinstance(c, CompareCheckSet), self._with_result())
        checks = filter(lambda c: isinstance(c, CompareCheck), self._with_result())
        checks = sorted(checks, key=lambda c: norm(c.get_result().diff), reverse=True)
        if filt is None:
            filt = self.filt
        checks = filter(lambda c: filt(c.get_result()), checks)

        sets = list(sets)
        checks = list(checks)
        no_results = not checks and not sets
        if silent_if_no_results and no_results:
            return ""

        head = [
            f"{' ' * _offset}Check set[{self.name}]:",
        ]
        lines = []
        if no_results:
            lines.append(f"{' ' * (_offset + 2)}No results.")

        for c in checks:
            s = c.formatted_string(silent_if_no_results, filt, _offset + 2)
            if s:
                lines.append(f"{' ' * (_offset + 2)}{s}")

        for s in sets:
            s = s.formatted_string(silent_if_no_results, filt, _offset + 2)
            if s:
                lines.append(s)

        if not lines:
            return ""

        head += lines
        return "\n".join(head) + "\n"

    def _with_result(self):
        return (c for c in self.checks if c.get_result() is not None)


@lru_cache(maxsize=None)
def _get_and_check_files(old_path, new_path, ext):
    files = list(filter(lambda f: f.endswith(ext), os.listdir(old_path)))
    s = set(files) ^ set(filter(lambda f: f.endswith(ext), os.listdir(new_path)))
    assert len(s) == 0, s
    return files


def build_check_set_for_files(
    name: str,
    old_path: str,
    new_path: str,
    *,
    ext: str = "",
    recursive: bool = False,
    op: Callable[[Any, Any], Any] = lambda previous, current: current - previous,
    do: Callable[[Any], Any] = lambda x: x,
    zero: Any = 0,
    diff_format: Callable[[Any], str] = lambda x: str(x),
    format: Callable[[Any], str] = lambda x: str(x),
):
    if recursive:
        raise NotImplementedError(
            "build_check_set_for_files is not implemented for recursive=True."
        )

    cs = CompareCheckSet(name)
    for file in _get_and_check_files(old_path, new_path, ext):
        cs.add_check(
            CompareCheck(
                file, os.path.join(old_path, file), os.path.join(new_path, file)
            )
        )
    cs.set_do(do)
    cs.set_op(op)
    cs.set_zero(zero)
    cs.set_diff_format(diff_format)
    cs.set_format(format)
    return cs

@@ -0,0 +1,34 @@
import re

from maps_generator.checks import check
from maps_generator.checks.logs import logs_reader

ADDR_PATTERN = re.compile(
    r".*BuildAddressTable\(\) Address: "
    r"Matched percent (?P<matched_percent>[0-9.]+) "
    r"Total: (?P<total>\d+) "
    r"Missing: (?P<missing>\d+)"
)


def get_addresses_check_set(old_path: str, new_path: str) -> check.CompareCheckSet:
    """
    Returns an addresses check set that checks the difference in the
    'matched_percent' value of BuildAddressTable between old and new logs.
    """

    def do(path: str):
        log = logs_reader.Log(path)
        if not log.is_mwm_log:
            return None
        found = logs_reader.find_and_parse(log.lines, ADDR_PATTERN)
        if not found:
            return None
        d = found[0][0]
        return float(d["matched_percent"])

    return check.build_check_set_for_files(
        "Addresses check", old_path, new_path, ext=".log", do=do
    )

@@ -0,0 +1,58 @@
from collections import defaultdict

from maps_generator.checks import check
from maps_generator.checks.check_mwm_types import count_all_types
from mwm import NAME_TO_INDEX_TYPE_MAPPING


def parse_groups(path):
    groups = defaultdict(set)
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("#"):
                continue
            if line.startswith("@"):
                continue
            array = line.split("@", maxsplit=1)
            if len(array) == 2:
                types_str, categories = array
                types_int = {
                    NAME_TO_INDEX_TYPE_MAPPING[t]
                    for t in types_str.strip("|").split("|")
                }
                for category in categories.split("|"):
                    category = category.replace("@", "", 1)
                    groups[category].update(types_int)
    return groups


def get_categories_check_set(
    old_path: str, new_path: str, categories_path: str
) -> check.CompareCheckSet:
    """
    Returns a categories check set that checks the difference in the number of
    objects in each category (from categories.txt) between old and new mwms.
    """
    cs = check.CompareCheckSet("Categories check")

    def make_do(indexes):
        def do(path):
            all_types = count_all_types(path)
            return sum(all_types[i] for i in indexes)

        return do

    for category, types in parse_groups(categories_path).items():
        cs.add_check(
            check.build_check_set_for_files(
                f"Category {category} check",
                old_path,
                new_path,
                ext=".mwm",
                do=make_do(types),
            )
        )
    return cs

@@ -0,0 +1,49 @@
import logging
from functools import lru_cache

from maps_generator.checks import check
from maps_generator.checks.logs import logs_reader
from maps_generator.generator.stages_declaration import stages


@lru_cache(maxsize=None)
def _get_log_stages(path):
    log = logs_reader.Log(path)
    return logs_reader.normalize_logs(logs_reader.split_into_stages(log))


def get_log_levels_check_set(old_path: str, new_path: str) -> check.CompareCheckSet:
    """
    Returns a log levels check set that checks the difference in the number of
    messages of level warning and higher, for each stage, between old logs
    and new logs.
    """
    cs = check.CompareCheckSet("Log levels check")

    def make_do(level, stage_name, cache={}):
        def do(path):
            for s in _get_log_stages(path):
                if s.name == stage_name:
                    k = f"{path}:{stage_name}"
                    if k not in cache:
                        cache[k] = logs_reader.count_levels(s)
                    return cache[k][level]
            return None

        return do

    for stage_name in (
        stages.get_visible_stages_names() + stages.get_invisible_stages_names()
    ):
        for level in (logging.CRITICAL, logging.ERROR, logging.WARNING):
            cs.add_check(
                check.build_check_set_for_files(
                    f"Stage {stage_name} - {logging.getLevelName(level)} check",
                    old_path,
                    new_path,
                    ext=".log",
                    do=make_do(level, stage_name),
                )
            )
    return cs

@@ -0,0 +1,61 @@
from collections import defaultdict
from functools import lru_cache
from typing import Union

from maps_generator.checks import check
from mwm import Mwm
from mwm import NAME_TO_INDEX_TYPE_MAPPING
from mwm import readable_type
from mwm import type_index


@lru_cache(maxsize=None)
def count_all_types(path: str):
    c = defaultdict(int)
    for ft in Mwm(path, parse=False):
        for t in ft.types():
            c[t] += 1
    return c


def get_mwm_type_check_set(
    old_path: str, new_path: str, type_: Union[str, int]
) -> check.CompareCheckSet:
    """
    Returns an mwm type check set that checks the difference in the number of
    objects of type [type_] between old and new mwms.
    """
    if isinstance(type_, str):
        type_ = type_index(type_)
    assert type_ >= 0, type_

    return check.build_check_set_for_files(
        f"Types check [{readable_type(type_)}]",
        old_path,
        new_path,
        ext=".mwm",
        do=lambda path: count_all_types(path)[type_],
    )


def get_mwm_types_check_set(old_path: str, new_path: str) -> check.CompareCheckSet:
    """
    Returns an mwm types check set that checks the difference in the number of
    objects of each type between old and new mwms.
    """
    cs = check.CompareCheckSet("Mwm types check")

    def make_do(index):
        return lambda path: count_all_types(path)[index]

    for t_name, t_index in NAME_TO_INDEX_TYPE_MAPPING.items():
        cs.add_check(
            check.build_check_set_for_files(
                f"Type {t_name} check",
                old_path,
                new_path,
                ext=".mwm",
                do=make_do(t_index),
            )
        )
    return cs


@@ -0,0 +1,124 @@
import os
from functools import lru_cache
from maps_generator.checks import check
from mwm import Mwm
class SectionNames:
def __init__(self, sections):
self.sections = sections
def __sub__(self, other):
return SectionNames(
{k: self.sections[k] for k in set(self.sections) - set(other.sections)}
)
def __lt__(self, other):
if isinstance(other, int):
return len(self.sections) < other
elif isinstance(other, SectionNames):
# Compare by section-name sets: dicts themselves do not support < in Python 3.
return set(self.sections) < set(other.sections)
assert False, type(other)
def __gt__(self, other):
if isinstance(other, int):
return len(self.sections) > other
elif isinstance(other, SectionNames):
return set(self.sections) > set(other.sections)
assert False, type(other)
def __len__(self):
return len(self.sections)
def __str__(self):
return str(self.sections)
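The `__sub__` above is what drives the appeared/disappeared section checks: subtracting keeps only the sections whose names are missing from the other side. A minimal standalone sketch (the section names and sizes are made up):

```python
class SectionNames:
    def __init__(self, sections):
        self.sections = sections

    def __sub__(self, other):
        # Keep only the sections whose names are missing from `other`.
        return SectionNames(
            {k: self.sections[k] for k in set(self.sections) - set(other.sections)}
        )

    def __len__(self):
        return len(self.sections)

old = SectionNames({"header": 16, "geometry": 1024, "search": 512})
new = SectionNames({"header": 16, "geometry": 2048})
disappeared = old - new  # sections present in old but gone from new
appeared = new - old     # sections new introduces
```

Size changes of sections that survive in both builds are not reported here; those are covered by the separate sections-size check.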
@lru_cache(maxsize=None)
def read_sections(path: str):
return Mwm(path, parse=False).sections_info()
def get_appeared_sections_check_set(
old_path: str, new_path: str
) -> check.CompareCheckSet:
return check.build_check_set_for_files(
"Appeared sections check",
old_path,
new_path,
ext=".mwm",
do=lambda path: SectionNames(read_sections(path)),
diff_format=lambda s: ", ".join(f"{k}:{v.size}" for k, v in s.sections.items()),
format=lambda s: f"number of sections: {len(s.sections)}",
)
def get_disappeared_sections_check_set(
old_path: str, new_path: str
) -> check.CompareCheckSet:
return check.build_check_set_for_files(
"Disappeared sections check",
old_path,
new_path,
ext=".mwm",
do=lambda path: SectionNames(read_sections(path)),
op=lambda previous, current: previous - current,
diff_format=lambda s: ", ".join(f"{k}:{v.size}" for k, v in s.sections.items()),
format=lambda s: f"number of sections: {len(s.sections)}",
)
def get_sections_existence_check_set(
old_path: str, new_path: str
) -> check.CompareCheckSet:
"""
Returns a check set that reports the sections that appeared in or
disappeared from the mwms between the old and the new builds.
"""
cs = check.CompareCheckSet("Sections existence check")
cs.add_check(get_appeared_sections_check_set(old_path, new_path))
cs.add_check(get_disappeared_sections_check_set(old_path, new_path))
return cs
def _get_sections_set(path):
sections = set()
for file in os.listdir(path):
p = os.path.join(path, file)
if os.path.isfile(p) and file.endswith(".mwm"):
sections.update(read_sections(p).keys())
return sections
def get_sections_size_check_set(old_path: str, new_path: str) -> check.CompareCheckSet:
"""
Returns a check set that compares the size of each mwm section
between the old and the new mwms.
"""
sections_set = _get_sections_set(old_path)
sections_set.update(_get_sections_set(new_path))
cs = check.CompareCheckSet("Sections size check")
def make_do(section):
def do(path):
sections = read_sections(path)
if section not in sections:
return None
return sections[section].size
return do
for section in sections_set:
cs.add_check(
check.build_check_set_for_files(
f"Size of {section} check",
old_path,
new_path,
ext=".mwm",
do=make_do(section),
)
)
return cs


@@ -0,0 +1,17 @@
import os
from maps_generator.checks import check
def get_size_check_set(old_path: str, new_path: str) -> check.CompareCheckSet:
"""
Returns a check set that compares the file size of each mwm between
the old and the new builds.
"""
return check.build_check_set_for_files(
"Size check",
old_path,
new_path,
ext=".mwm",
do=lambda path: os.path.getsize(path),
)


@@ -0,0 +1,167 @@
import sys
from collections import namedtuple
from enum import Enum
from typing import Callable
from typing import Mapping
from typing import Optional
from typing import Set
from typing import Tuple
from maps_generator.checks import check
from maps_generator.checks.check_addresses import get_addresses_check_set
from maps_generator.checks.check_categories import get_categories_check_set
from maps_generator.checks.check_log_levels import get_log_levels_check_set
from maps_generator.checks.check_mwm_types import get_mwm_type_check_set
from maps_generator.checks.check_mwm_types import get_mwm_types_check_set
from maps_generator.checks.check_sections import get_sections_existence_check_set
from maps_generator.checks.check_sections import get_sections_size_check_set
from maps_generator.checks.check_size import get_size_check_set
class CheckType(Enum):
low = 1
medium = 2
hard = 3
strict = 4
Threshold = namedtuple("Threshold", ["abs", "rel"])
_default_thresholds = {
CheckType.low: Threshold(abs=20, rel=20),
CheckType.medium: Threshold(abs=15, rel=15),
CheckType.hard: Threshold(abs=10, rel=10),
CheckType.strict: Threshold(abs=0, rel=0),
}
def set_thresholds(check_type_map: Mapping[CheckType, Threshold]):
global _default_thresholds
_default_thresholds = check_type_map
def make_tmap(
low: Optional[Tuple[float, float]] = None,
medium: Optional[Tuple[float, float]] = None,
hard: Optional[Tuple[float, float]] = None,
strict: Optional[Tuple[float, float]] = None,
):
thresholds = _default_thresholds.copy()
if low is not None:
thresholds[CheckType.low] = Threshold(*low)
if medium is not None:
thresholds[CheckType.medium] = Threshold(*medium)
if hard is not None:
thresholds[CheckType.hard] = Threshold(*hard)
if strict is not None:
thresholds[CheckType.strict] = Threshold(*strict)
return thresholds
def make_default_filter(check_type_map: Mapping[CheckType, Threshold] = None):
if check_type_map is None:
check_type_map = _default_thresholds
def maker(check_type: CheckType):
threshold = check_type_map[check_type]
def default_filter(r: check.ResLine):
return (
check.norm(r.diff) > threshold.abs and check.get_rel(r) > threshold.rel
)
return default_filter
return maker
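The factory-of-closures pattern in `make_default_filter` can be sketched in isolation; the `Result` record below is a made-up stand-in for `check.ResLine`, and plain `abs()` stands in for the real `check.norm`/`check.get_rel` helpers:

```python
from collections import namedtuple

Threshold = namedtuple("Threshold", ["abs", "rel"])
# Hypothetical result record: absolute difference and relative change in percent.
Result = namedtuple("Result", ["diff", "rel"])

def make_filter(threshold):
    # The closure captures the threshold, so one factory serves any CheckType.
    def default_filter(r):
        # A result is kept only when BOTH the absolute and the relative
        # change exceed the threshold.
        return abs(r.diff) > threshold.abs and r.rel > threshold.rel
    return default_filter

medium_filter = make_filter(Threshold(abs=15, rel=15))
```

Requiring both conditions means a huge relative jump on a tiny count (or a small relative drift on a huge count) is filtered out as noise.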
class MwmsChecks(Enum):
sections_existence = 1
sections_size = 2
mwm_size = 3
types = 4
booking = 5
categories = 6
def get_mwm_check_sets_and_filters(
old_path: str, new_path: str, checks: Set[MwmsChecks] = None, **kwargs
) -> Mapping[check.Check, Callable]:
def need_add(t: MwmsChecks):
return checks is None or t in checks
m = {get_sections_existence_check_set(old_path, new_path): None}
if need_add(MwmsChecks.sections_size):
c = get_sections_size_check_set(old_path, new_path)
thresholds = make_tmap(low=(0, 20), medium=(0, 10), hard=(0, 5))
m[c] = make_default_filter(thresholds)
mb = 1 << 20
if need_add(MwmsChecks.mwm_size):
c = get_size_check_set(old_path, new_path)
thresholds = make_tmap(low=(2 * mb, 10), medium=(mb, 5), hard=(0.5 * mb, 2))
m[c] = make_default_filter(thresholds)
if need_add(MwmsChecks.types):
c = get_mwm_types_check_set(old_path, new_path)
thresholds = make_tmap(low=(500, 30), medium=(100, 20), hard=(100, 10))
m[c] = make_default_filter(thresholds)
if need_add(MwmsChecks.booking):
c = get_mwm_type_check_set(old_path, new_path, "sponsored-booking")
thresholds = make_tmap(low=(500, 20), medium=(50, 10), hard=(50, 5))
m[c] = make_default_filter(thresholds)
if need_add(MwmsChecks.categories):
c = get_categories_check_set(old_path, new_path, kwargs["categories_path"])
thresholds = make_tmap(low=(200, 20), medium=(50, 10), hard=(50, 5))
m[c] = make_default_filter(thresholds)
return m
class LogsChecks(Enum):
log_levels = 1
addresses = 2
def get_logs_check_sets_and_filters(
old_path: str, new_path: str, checks: Set[LogsChecks] = None
) -> Mapping[check.Check, Callable]:
def need_add(t: LogsChecks):
return checks is None or t in checks
m = {get_log_levels_check_set(old_path, new_path): None}
if need_add(LogsChecks.addresses):
c = get_addresses_check_set(old_path, new_path)
thresholds = make_tmap(low=(50, 20), medium=(20, 10), hard=(10, 5))
m[c] = make_default_filter(thresholds)
return m
def _print_header(file, header, width=100, s="="):
stars = s * ((width - len(header)) // 2)
rstars = stars
if 2 * len(stars) + len(header) < width:
rstars += s
print(stars, header, rstars, file=file)
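The centering arithmetic in `_print_header` can be checked with a return-a-string variant (a sketch mirroring the logic above, not the function itself):

```python
def format_header(header, width=100, s="="):
    # Mirror of _print_header that returns the line instead of printing it.
    stars = s * ((width - len(header)) // 2)
    rstars = stars
    if 2 * len(stars) + len(header) < width:
        rstars += s  # pad the right side when the width difference is odd
    return f"{stars} {header} {rstars}"
```

Note the two separating spaces are not counted against `width`, matching the `print(stars, header, rstars)` behaviour above.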
def run_checks_and_print_results(
checks: Mapping[check.Check, Callable],
check_type: CheckType,
silent_if_no_results: bool = True,
file=sys.stdout,
):
for check, make_filt in checks.items():
check.check()
_print_header(file, check.name)
check.print(
silent_if_no_results=silent_if_no_results,
filt=None if make_filt is None else make_filt(check_type),
file=file,
)


@@ -0,0 +1,241 @@
import datetime
import logging
import os
import re
from collections import Counter
from collections import namedtuple
from enum import Enum
from pathlib import Path
from typing import List
from typing import Tuple
from typing import Union
import maps_generator.generator.env as env
from maps_generator.generator.stages import get_stage_type
from maps_generator.utils.algo import parse_timedelta
logger = logging.getLogger(__name__)
FLAGS = re.MULTILINE | re.DOTALL
GEN_LINE_PATTERN = re.compile(
r"^LOG\s+TID\((?P<tid>\d+)\)\s+(?P<level>[A-Z]+)\s+"
r"(?P<timestamp>[-.e0-9]+)\s+(?P<message>.+)$",
FLAGS,
)
GEN_LINE_CHECK_PATTERN = re.compile(
r"^TID\((?P<tid>\d+)\)\s+" r"ASSERT FAILED\s+(?P<message>.+)$", FLAGS
)
MAPS_GEN_LINE_PATTERN = re.compile(
r"^\[(?P<time_string>[0-9-:, ]+)\]\s+(?P<level>\w+)\s+"
r"(?P<module>\w+)\s+(?P<message>.+)$",
FLAGS,
)
STAGE_START_MSG_PATTERN = re.compile(r"^Stage (?P<name>\w+): start ...$")
STAGE_FINISH_MSG_PATTERN = re.compile(
r"^Stage (?P<name>\w+): finished in (?P<duration_string>.+)$"
)
LogLine = namedtuple("LogLine", ["timestamp", "level", "tid", "message", "type"])
LogStage = namedtuple("LogStage", ["name", "duration", "lines"])
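`GEN_LINE_PATTERN` can be exercised on a synthetic line to see what the named groups capture (the sample log line below is made up):

```python
import re

FLAGS = re.MULTILINE | re.DOTALL
GEN_LINE_PATTERN = re.compile(
    r"^LOG\s+TID\((?P<tid>\d+)\)\s+(?P<level>[A-Z]+)\s+"
    r"(?P<timestamp>[-.e0-9]+)\s+(?P<message>.+)$",
    FLAGS,
)

# A made-up line in the generator_tool log format.
sample = "LOG TID(5) WARNING 12.5 Feature has no types"
m = GEN_LINE_PATTERN.match(sample)
```

With `re.DOTALL`, the trailing `message` group can also swallow continuation lines that were glued onto `logline` before parsing.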
class LogType(Enum):
gen = 1
maps_gen = 2
class Log:
def __init__(self, path: str):
self.path = Path(path)
self.name = self.path.stem
self.is_stage_log = False
self.is_mwm_log = False
try:
get_stage_type(self.name)
self.is_stage_log = True
except AttributeError:
if self.name in env.COUNTRIES_NAMES or self.name in env.WORLDS_NAMES:
self.is_mwm_log = True
self.lines = self._parse_lines()
def _parse_lines(self) -> List[LogLine]:
logline = ""
state = None
lines = []
base_timestamp = 0.0
def try_parse_and_insert():
nonlocal logline
logline = logline.strip()
if not logline:
return
nonlocal base_timestamp
line = None
if state == LogType.gen:
line = Log._parse_gen_line(logline, base_timestamp)
elif state == LogType.maps_gen:
line = Log._parse_maps_gen_line(logline)
base_timestamp = line.timestamp
if line is not None:
lines.append(line)
else:
logger.warning(f"{self.name}: line was not parsed: {logline}")
logline = ""
with self.path.open() as logfile:
for line in logfile:
if line.startswith("LOG") or line.startswith("TID"):
try_parse_and_insert()
state = LogType.gen
elif line.startswith("["):
try_parse_and_insert()
state = LogType.maps_gen
logline += line
try_parse_and_insert()
return lines
@staticmethod
def _parse_gen_line(line: str, base_time: float = 0.0) -> LogLine:
m = GEN_LINE_PATTERN.match(line)
if m:
return LogLine(
timestamp=base_time + float(m["timestamp"]),
level=logging.getLevelName(m["level"]),
tid=int(m["tid"]),
message=m["message"],
type=LogType.gen,
)
m = GEN_LINE_CHECK_PATTERN.match(line)
if m:
return LogLine(
timestamp=None,
level=logging.getLevelName("CRITICAL"),
tid=None,
message=m["message"],
type=LogType.gen,
)
assert False, line
@staticmethod
def _parse_maps_gen_line(line: str) -> LogLine:
m = MAPS_GEN_LINE_PATTERN.match(line)
if m:
# Parse the time only after a successful match; the original read
# m["time_string"] before checking that the line matched at all.
time_string = m["time_string"].split(",")[0]
timestamp = datetime.datetime.strptime(
time_string, logging.Formatter.default_time_format
).timestamp()
return LogLine(
timestamp=float(timestamp),
level=logging.getLevelName(m["level"]),
tid=None,
message=m["message"],
type=LogType.maps_gen,
)
assert False, line
class LogsReader:
def __init__(self, path: str):
self.path = os.path.abspath(os.path.expanduser(path))
def __iter__(self):
for filename in os.listdir(self.path):
if filename.endswith(".log"):
yield Log(os.path.join(self.path, filename))
def split_into_stages(log: Log) -> List[LogStage]:
log_stages = []
name = None
lines = []
for line in log.lines:
if line.message.startswith("Stage"):
m = STAGE_START_MSG_PATTERN.match(line.message)
if m:
if name is not None:
logger.warning(f"{log.name}: stage {name} has no finish line.")
log_stages.append(LogStage(name=name, duration=None, lines=lines))
name = m["name"]
m = STAGE_FINISH_MSG_PATTERN.match(line.message)
if m:
# assert name == m["name"], line
duration = parse_timedelta(m["duration_string"])
log_stages.append(LogStage(name=name, duration=duration, lines=lines))
name = None
lines = []
else:
lines.append(line)
if name is not None:
logger.warning(f"{log.name}: stage {name} has no finish line.")
log_stages.append(LogStage(name=name, duration=None, lines=lines))
return log_stages
def _is_worse(lhs: LogStage, rhs: LogStage) -> bool:
if (lhs.duration is None) ^ (rhs.duration is None):
return lhs.duration is None
if len(rhs.lines) > len(lhs.lines):
return True
if lhs.duration is None:
# Both durations are unknown; comparing two None values would raise.
return False
return rhs.duration > lhs.duration
def normalize_logs(llogs: List[LogStage]) -> List[LogStage]:
normalized_logs = []
buckets = {}
for log in llogs:
if log.name in buckets:
if _is_worse(normalized_logs[buckets[log.name]], log):
normalized_logs[buckets[log.name]] = log
else:
normalized_logs.append(log)
buckets[log.name] = len(normalized_logs) - 1
return normalized_logs
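The bucket pattern in `normalize_logs` (one kept entry per name, replaced in place whenever a duplicate compares as "worse") can be sketched standalone; here "worse" is simplified to "longer duration" and the stage names are made up:

```python
def keep_worst(items, is_worse):
    kept = []
    index_by_name = {}
    for name, duration in items:
        if name in index_by_name:
            i = index_by_name[name]
            if is_worse(kept[i], (name, duration)):
                # A worse duplicate overwrites the kept entry in place,
                # so the first-seen order of names is preserved.
                kept[i] = (name, duration)
        else:
            kept.append((name, duration))
            index_by_name[name] = len(kept) - 1
    return kept

runs = [("Routing", 10), ("Routing", 30), ("Index", 5), ("Routing", 20)]
worst = keep_worst(runs, lambda old, new: new[1] > old[1])
```

Keeping the worst of the duplicates makes the later comparison between builds pessimistic rather than optimistic.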
def count_levels(logs: Union[List[LogLine], LogStage]) -> Counter:
if isinstance(logs, list):
return Counter(log.level for log in logs)
if isinstance(logs, LogStage):
return count_levels(logs.lines)
assert False, f"Type {type(logs)} is unsupported."
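`count_levels` works because the parsers store levels as numeric values: `logging.getLevelName` maps a level *name* to its number, so the resulting `Counter` can later be indexed with `logging.WARNING`, `logging.ERROR`, etc. A small sketch with made-up messages:

```python
import logging
from collections import Counter

# Made-up (level_name, message) pairs standing in for parsed LogLine records.
messages = [
    ("INFO", "stage started"),
    ("WARNING", "feature has no types"),
    ("WARNING", "unknown tag"),
]
# getLevelName("WARNING") returns 30 == logging.WARNING, and a Counter
# returns 0 for levels that never occurred.
levels = Counter(logging.getLevelName(name) for name, _ in messages)
```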
def find_and_parse(
logs: Union[List[LogLine], LogStage], pattern: Union[str, type(re.compile(""))],
) -> List[Tuple[dict, str]]:
if isinstance(pattern, str):
pattern = re.compile(pattern, FLAGS)
if isinstance(logs, list):
found = []
for log in logs:
m = pattern.match(log.message)
if m:
found.append((m.groupdict(), log))
return found
if isinstance(logs, LogStage):
return find_and_parse(logs.lines, pattern)
assert False, f"Type {type(logs)} is unsupported."


@@ -0,0 +1,37 @@
import argparse
from maps_generator.generator.statistics import diff
from maps_generator.generator.statistics import read_types
def get_args():
parser = argparse.ArgumentParser(
description="This script prints the difference between old_stats.json and new_stats.json."
)
parser.add_argument(
"--old",
default="",
type=str,
required=True,
help="Path to old file with map generation statistics.",
)
parser.add_argument(
"--new",
default="",
type=str,
required=True,
help="Path to new file with map generation statistics.",
)
return parser.parse_args()
def main():
args = get_args()
old = read_types(args.old)
new = read_types(args.new)
for line in diff(new, old):
print(";".join(str(x) for x in line))
if __name__ == "__main__":
main()


@@ -0,0 +1,68 @@
"""
This file contains api for osmfilter and generator_tool to generate coastline.
"""
import os
import subprocess
from maps_generator.generator import settings
from maps_generator.generator.env import Env
from maps_generator.generator.gen_tool import run_gen_tool
from maps_generator.generator.osmtools import osmfilter
def filter_coastline(
name_executable,
in_file,
out_file,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
):
osmfilter(
name_executable,
in_file,
out_file,
output=output,
error=error,
keep="",
keep_ways="natural=coastline",
keep_nodes="capital=yes place=town =city",
)
def make_coastline(env: Env):
coastline_o5m = os.path.join(env.paths.coastline_path, "coastline.o5m")
filter_coastline(
env[settings.OSM_TOOL_FILTER],
env.paths.planet_o5m,
coastline_o5m,
output=env.get_subprocess_out(),
error=env.get_subprocess_out(),
)
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(),
err=env.get_subprocess_out(),
data_path=env.paths.data_path,
intermediate_data_path=env.paths.coastline_path,
osm_file_type="o5m",
osm_file_name=coastline_o5m,
node_storage=env.node_storage,
user_resource_path=env.paths.user_resource_path,
preprocess=True,
)
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(),
err=env.get_subprocess_out(),
data_path=env.paths.data_path,
intermediate_data_path=env.paths.coastline_path,
osm_file_type="o5m",
osm_file_name=coastline_o5m,
node_storage=env.node_storage,
user_resource_path=env.paths.user_resource_path,
make_coasts=True,
fail_on_coasts=True,
threads_count=settings.THREADS_COUNT,
)


@@ -0,0 +1,100 @@
from pathlib import Path
import subprocess
import warnings
class Status:
NO_NEW_VERSION = "Failed: new version doesn't exist: {new}"
INTERNAL_ERROR = "Failed: internal error (C++ module) while calculating"
NO_OLD_VERSION = "Skipped: old version doesn't exist: {old}"
NOTHING_TO_DO = "Skipped: output already exists: {out}"
OK = "Succeeded: calculated {out}: {diff_size} out of {new_size} bytes"
TOO_LARGE = "Cancelled: {out}: diff {diff_size} > new version {new_size}"
@classmethod
def is_error(cls, status):
return status == cls.NO_NEW_VERSION or status == cls.INTERNAL_ERROR
def calculate_diff(params):
diff_tool, new, old, out = params["tool"], params["new"], params["old"], params["out"]
if not new.exists():
return Status.NO_NEW_VERSION, params
if not old.exists():
return Status.NO_OLD_VERSION, params
status = Status.OK
if out.exists():
status = Status.NOTHING_TO_DO
else:
res = subprocess.run([diff_tool.as_posix(), "make", old, new, out])
if res.returncode != 0:
return Status.INTERNAL_ERROR, params
diff_size = out.stat().st_size
new_size = new.stat().st_size
if diff_size > new_size:
status = Status.TOO_LARGE
params.update({
"diff_size": diff_size,
"new_size": new_size
})
return status, params
def mwm_diff_calculation(data_dir, logger, depth):
data = list(data_dir.get_mwms())[:depth]
results = map(calculate_diff, data)
for status, params in results:
if Status.is_error(status):
raise Exception(status.format(**params))
logger.info(status.format(**params))
class DataDir:
def __init__(self, diff_tool, mwm_name, new_version_dir, old_version_root_dir):
self.diff_tool_path = Path(diff_tool)
self.mwm_name = mwm_name
self.diff_name = self.mwm_name + ".mwmdiff"
self.new_version_dir = Path(new_version_dir)
self.new_version_path = Path(new_version_dir, mwm_name)
self.old_version_root_dir = Path(old_version_root_dir)
def get_mwms(self):
old_versions = sorted(
self.old_version_root_dir.glob("[0-9]*"),
reverse=True
)
for old_version_dir in old_versions:
if (old_version_dir != self.new_version_dir and
old_version_dir.is_dir()):
diff_dir = Path(self.new_version_dir, old_version_dir.name)
diff_dir.mkdir(exist_ok=True)
yield {
"tool": self.diff_tool_path,
"new": self.new_version_path,
"old": Path(old_version_dir, self.mwm_name),
"out": Path(diff_dir, self.diff_name)
}
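`get_mwms` relies on `sorted(..., reverse=True)` over the `[0-9]*` glob to visit newer versions first; since version directories are named by date (e.g. YYMMDD), lexicographic order equals chronological order. A sketch with made-up version names:

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
for version in ("210501", "210615", "210301"):
    (root / version).mkdir()
(root / "notes").mkdir()  # ignored: the glob only matches names starting with a digit

# Newest first, because date-shaped names sort lexicographically as dates.
versions = [p.name for p in sorted(root.glob("[0-9]*"), reverse=True)]
```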
if __name__ == "__main__":
import logging
import sys
logger = logging.getLogger()
logger.addHandler(logging.StreamHandler(stream=sys.stdout))
logger.setLevel(logging.DEBUG)
data_dir = DataDir(
mwm_name=sys.argv[1], new_version_dir=sys.argv[2],
old_version_root_dir=sys.argv[3],
)
mwm_diff_calculation(data_dir, logger, depth=1)


@@ -0,0 +1,582 @@
import collections
import datetime
import logging
import logging.config
import os
import shutil
import sys
from functools import wraps
from typing import Any
from typing import AnyStr
from typing import Callable
from typing import Dict
from typing import List
from typing import Optional
from typing import Set
from typing import Type
from typing import Union
from maps_generator.generator import settings
from maps_generator.generator import status
from maps_generator.generator.osmtools import build_osmtools
from maps_generator.generator.stages import Stage
from maps_generator.utils.file import find_executable
from maps_generator.utils.file import is_executable
from maps_generator.utils.file import make_symlink
logger = logging.getLogger("maps_generator")
WORLD_NAME = "World"
WORLD_COASTS_NAME = "WorldCoasts"
WORLDS_NAMES = {WORLD_NAME, WORLD_COASTS_NAME}
def get_all_countries_list(borders_path: AnyStr) -> List[AnyStr]:
"""Returns all countries including World and WorldCoasts."""
return [
f.replace(".poly", "")
for f in os.listdir(borders_path)
if os.path.isfile(os.path.join(borders_path, f))
] + list(WORLDS_NAMES)
def create_if_not_exist_path(path: AnyStr) -> bool:
"""Creates directory if it doesn't exist."""
try:
os.makedirs(path)
logger.info(f"Create {path} ...")
return True
except FileExistsError:
return False
def create_if_not_exist(func: Callable[..., AnyStr]) -> Callable[..., AnyStr]:
"""
A decorator that wraps a path-returning func in create_if_not_exist_path,
so the directory exists by the time the path is returned.
"""
@wraps(func)
def wrapper(*args, **kwargs):
path = func(*args, **kwargs)
create_if_not_exist_path(path)
return path
return wrapper
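The decorator pattern above (ensure a returned path exists before handing it out) can be sketched standalone; this variant uses `exist_ok=True` instead of catching `FileExistsError`, and `stats_path` is a made-up example property:

```python
import os
import tempfile
from functools import wraps

def create_if_not_exist(func):
    # Wraps a path-returning function so the directory is created on first use.
    @wraps(func)
    def wrapper(*args, **kwargs):
        path = func(*args, **kwargs)
        os.makedirs(path, exist_ok=True)
        return path
    return wrapper

@create_if_not_exist
def stats_path(base):
    # Hypothetical path builder, standing in for the PathProvider properties.
    return os.path.join(base, "stats")

base = tempfile.mkdtemp()
p = stats_path(base)  # the directory exists after this call
```

This is why the `PathProvider` properties below can be used freely without sprinkling `os.makedirs` calls through the pipeline.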
class Version:
"""It's used for writing and reading a generation version."""
@staticmethod
def write(out_path: AnyStr, version: AnyStr):
with open(os.path.join(out_path, settings.VERSION_FILE_NAME), "w") as f:
f.write(str(version))
@staticmethod
def read(version_path: AnyStr) -> int:
with open(version_path) as f:
line = f.readline().strip()
try:
return int(line)
except ValueError:
logger.exception(f"Cast '{line}' to int error.")
return 0
def find_last_build_dir(hint: Optional[AnyStr] = None) -> Optional[AnyStr]:
"""
Tries to find the most recent generation directory and returns its name
if it is found. Otherwise returns None.
"""
if hint is not None:
p = os.path.join(settings.MAIN_OUT_PATH, hint)
return hint if os.path.exists(p) else None
try:
paths = [
os.path.join(settings.MAIN_OUT_PATH, f)
for f in os.listdir(settings.MAIN_OUT_PATH)
]
except FileNotFoundError:
logger.exception(f"{settings.MAIN_OUT_PATH} not found.")
return None
versions = []
for path in paths:
version_path = os.path.join(path, settings.VERSION_FILE_NAME)
if not os.path.isfile(version_path):
versions.append(0)
else:
versions.append(Version.read(version_path))
pairs = sorted(zip(paths, versions), key=lambda p: p[1], reverse=True)
return None if not pairs or pairs[0][1] == 0 else pairs[0][0].split(os.sep)[-1]
class PathProvider:
"""
PathProvider is used for building paths for a maps generation.
"""
def __init__(self, build_path: AnyStr, build_name: AnyStr, mwm_version: AnyStr):
self.build_path = build_path
self.build_name = build_name
self.mwm_version = mwm_version
create_if_not_exist_path(self.build_path)
@property
@create_if_not_exist
def intermediate_data_path(self) -> AnyStr:
"""
intermediate_data_path contains intermediate files,
for example downloaded external files, that are needed for generation,
*.mwm.tmp files, etc.
"""
return os.path.join(self.build_path, "intermediate_data")
@property
@create_if_not_exist
def cache_path(self) -> AnyStr:
"""cache_path contains caches for nodes, ways, relations."""
if not settings.CACHE_PATH:
return self.intermediate_data_path
return os.path.join(settings.CACHE_PATH, self.build_name)
@property
@create_if_not_exist
def data_path(self) -> AnyStr:
"""It's a synonym for intermediate_data_path."""
return self.intermediate_data_path
@property
@create_if_not_exist
def intermediate_tmp_path(self) -> AnyStr:
"""intermediate_tmp_path contains *.mwm.tmp files."""
return os.path.join(self.intermediate_data_path, "tmp")
@property
@create_if_not_exist
def mwm_path(self) -> AnyStr:
"""mwm_path contains *.mwm files."""
return os.path.join(self.build_path, self.mwm_version)
@property
@create_if_not_exist
def log_path(self) -> AnyStr:
"""mwm_path log files."""
return os.path.join(self.build_path, "logs")
@property
@create_if_not_exist
def generation_borders_path(self) -> AnyStr:
"""
generation_borders_path contains *.poly files, that define
which .mwm files are generated.
"""
return os.path.join(self.intermediate_data_path, "borders")
@property
@create_if_not_exist
def draft_path(self) -> AnyStr:
"""draft_path is used for saving temporary intermediate files."""
return os.path.join(self.build_path, "draft")
@property
@create_if_not_exist
def osm2ft_path(self) -> AnyStr:
"""osm2ft_path contains osmId<->ftId mappings."""
return os.path.join(self.build_path, "osm2ft")
@property
@create_if_not_exist
def coastline_path(self) -> AnyStr:
"""coastline_path is used for a coastline generation."""
return os.path.join(self.intermediate_data_path, "coasts")
@property
@create_if_not_exist
def coastline_tmp_path(self) -> AnyStr:
"""coastline_tmp_path is used for a coastline generation."""
return os.path.join(self.coastline_path, "tmp")
@property
@create_if_not_exist
def status_path(self) -> AnyStr:
"""status_path contains status files."""
return os.path.join(self.build_path, "status")
@property
@create_if_not_exist
def descriptions_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "descriptions")
@property
@create_if_not_exist
def stats_path(self) -> AnyStr:
return os.path.join(self.build_path, "stats")
@property
@create_if_not_exist
def transit_path(self) -> AnyStr:
return self.intermediate_data_path
@property
def transit_path_experimental(self) -> AnyStr:
return (
os.path.join(self.intermediate_data_path, "transit_from_gtfs")
if settings.TRANSIT_URL
else ""
)
@property
def world_roads_path(self) -> AnyStr:
return (
os.path.join(self.intermediate_data_path, "world_roads.txt")
if settings.NEED_BUILD_WORLD_ROADS
else ""
)
@property
def planet_osm_pbf(self) -> AnyStr:
return os.path.join(self.build_path, f"{settings.PLANET}.osm.pbf")
@property
def planet_o5m(self) -> AnyStr:
return os.path.join(self.build_path, f"{settings.PLANET}.o5m")
@property
def world_roads_o5m(self) -> AnyStr:
return os.path.join(self.build_path, "world_roads.o5m")
@property
def main_status_path(self) -> AnyStr:
return os.path.join(self.status_path, status.with_stat_ext("stages"))
@property
def packed_polygons_path(self) -> AnyStr:
return os.path.join(self.mwm_path, "packed_polygons.bin")
@property
def localads_path(self) -> AnyStr:
return os.path.join(self.build_path, f"localads_{self.mwm_version}")
@property
def types_path(self) -> AnyStr:
return os.path.join(self.user_resource_path, "types.txt")
@property
def external_resources_path(self) -> AnyStr:
return os.path.join(self.mwm_path, "external_resources.txt")
@property
def id_to_wikidata_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "id_to_wikidata.csv")
@property
def wiki_url_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "wiki_urls.txt")
@property
def ugc_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "ugc_db.sqlite3")
@property
def hotels_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "hotels.csv")
@property
def promo_catalog_cities_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "promo_catalog_cities.json")
@property
def promo_catalog_countries_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "promo_catalog_countries.json")
@property
def popularity_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "popular_places.csv")
@property
def subway_path(self) -> AnyStr:
return os.path.join(
self.intermediate_data_path, "mapsme_osm_subways.transit.json"
)
@property
def food_paths(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "ids_food.json")
@property
def food_translations_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "translations_food.json")
@property
def cities_boundaries_path(self) -> AnyStr:
return os.path.join(self.intermediate_data_path, "cities_boundaries.bin")
@property
def hierarchy_path(self) -> AnyStr:
return os.path.join(self.user_resource_path, "hierarchy.txt")
@property
def old_to_new_path(self) -> AnyStr:
return os.path.join(self.user_resource_path, "old_vs_new.csv")
@property
def borders_to_osm_path(self) -> AnyStr:
return os.path.join(self.user_resource_path, "borders_vs_osm.csv")
@property
def countries_synonyms_path(self) -> AnyStr:
return os.path.join(self.user_resource_path, "countries_synonyms.csv")
@property
def counties_txt_path(self) -> AnyStr:
return os.path.join(self.mwm_path, "countries.txt")
@property
def user_resource_path(self) -> AnyStr:
return settings.USER_RESOURCE_PATH
@staticmethod
def srtm_path() -> AnyStr:
return settings.SRTM_PATH
@staticmethod
def isolines_path() -> AnyStr:
return settings.ISOLINES_PATH
@staticmethod
def addresses_path() -> AnyStr:
return settings.ADDRESSES_PATH
@staticmethod
def borders_path() -> AnyStr:
return os.path.join(settings.USER_RESOURCE_PATH, "borders")
@staticmethod
@create_if_not_exist
def tmp_dir():
return settings.TMPDIR
COUNTRIES_NAMES = set(get_all_countries_list(PathProvider.borders_path()))
class Env:
"""
Env provides the generation environment: it sets up the tools and paths
used for maps generation and stores the state of the generation run.
"""
def __init__(
self,
countries: Optional[List[AnyStr]] = None,
production: bool = False,
build_name: Optional[AnyStr] = None,
build_suffix: AnyStr = "",
skipped_stages: Optional[Set[Type[Stage]]] = None,
force_download_files: bool = False,
):
self.setup_logging()
logger.info("Start setup ...")
os.environ["TMPDIR"] = PathProvider.tmp_dir()
for k, v in self.setup_osm_tools().items():
setattr(self, k, v)
self.production = production
self.force_download_files = force_download_files
self.countries = countries
self.skipped_stages = set() if skipped_stages is None else skipped_stages
if self.countries is None:
self.countries = get_all_countries_list(PathProvider.borders_path())
self.node_storage = settings.NODE_STORAGE
version_format = "%Y_%m_%d__%H_%M_%S"
suffix_div = "-"
self.dt = None
if build_name is None:
self.dt = datetime.datetime.now()
build_name = self.dt.strftime(version_format)
if build_suffix:
build_name = f"{build_name}{suffix_div}{build_suffix}"
else:
s = build_name.split(suffix_div, maxsplit=1)
if len(s) == 1:
s.append("")
date_str, build_suffix = s
self.dt = datetime.datetime.strptime(date_str, version_format)
self.build_suffix = build_suffix
self.mwm_version = self.dt.strftime("%y%m%d")
self.planet_version = self.dt.strftime("%s")
self.build_path = os.path.join(settings.MAIN_OUT_PATH, build_name)
self.build_name = build_name
self.gen_tool = self.setup_generator_tool()
if WORLD_NAME in self.countries:
self.world_roads_builder_tool = self.setup_world_roads_builder_tool()
self.diff_tool = self.setup_mwm_diff_tool()
logger.info(f"Build name is {self.build_name}.")
logger.info(f"Build path is {self.build_path}.")
self.paths = PathProvider(self.build_path, self.build_name, self.mwm_version)
Version.write(self.build_path, self.planet_version)
self.setup_borders()
self.setup_osm2ft()
if self.force_download_files:
for item in os.listdir(self.paths.status_path):
if item.endswith(".download"):
os.remove(os.path.join(self.paths.status_path, item))
self.main_status = status.Status()
# self.countries_meta stores log files and statuses for each country.
self.countries_meta = collections.defaultdict(dict)
self.subprocess_out = None
self.subprocess_countries_out = {}
printed_countries = ", ".join(self.countries)
if len(self.countries) > 50:
printed_countries = (
f"{', '.join(self.countries[:25])}, ..., "
f"{', '.join(self.countries[-25:])}"
)
logger.info(
f"The following {len(self.countries)} maps will build: "
f"{printed_countries}."
)
logger.info("Finish setup")
def __getitem__(self, item):
return self.__dict__[item]
def get_tmp_mwm_names(self) -> List[AnyStr]:
tmp_ext = ".mwm.tmp"
existing_names = set()
for f in os.listdir(self.paths.intermediate_tmp_path):
path = os.path.join(self.paths.intermediate_tmp_path, f)
if f.endswith(tmp_ext) and os.path.isfile(path):
name = f.replace(tmp_ext, "")
if name in self.countries:
existing_names.add(name)
return [c for c in self.countries if c in existing_names]
def add_skipped_stage(self, stage: Union[Type[Stage], Stage]):
if isinstance(stage, Stage):
stage = stage.__class__
self.skipped_stages.add(stage)
def is_accepted_stage(self, stage: Union[Type[Stage], Stage]) -> bool:
if isinstance(stage, Stage):
stage = stage.__class__
return stage not in self.skipped_stages
def finish(self):
self.main_status.finish()
def finish_mwm(self, mwm_name: AnyStr):
self.countries_meta[mwm_name]["status"].finish()
def set_subprocess_out(self, subprocess_out: Any, country: Optional[AnyStr] = None):
if country is None:
self.subprocess_out = subprocess_out
else:
self.subprocess_countries_out[country] = subprocess_out
def get_subprocess_out(self, country: Optional[AnyStr] = None):
if country is None:
return self.subprocess_out
else:
return self.subprocess_countries_out[country]
@staticmethod
def setup_logging():
def exception_handler(type, value, tb):
logger.exception(
f"Uncaught exception: {str(value)}", exc_info=(type, value, tb)
)
logging.config.dictConfig(settings.LOGGING)
sys.excepthook = exception_handler
@staticmethod
def setup_generator_tool() -> AnyStr:
logger.info("Check generator tool ...")
exceptions = []
for gen_tool in settings.POSSIBLE_GEN_TOOL_NAMES:
gen_tool_path = shutil.which(gen_tool)
if gen_tool_path is None:
logger.info(f"Looking for generator tool in {settings.BUILD_PATH} ...")
try:
gen_tool_path = find_executable(settings.BUILD_PATH, gen_tool)
except FileNotFoundError as e:
exceptions.append(e)
continue
logger.info(f"Generator tool found - {gen_tool_path}")
return gen_tool_path
raise Exception(exceptions)
@staticmethod
def setup_world_roads_builder_tool() -> AnyStr:
logger.info(f"Check world_roads_builder_tool. Looking for it in {settings.BUILD_PATH} ...")
world_roads_builder_tool_path = find_executable(settings.BUILD_PATH, "world_roads_builder_tool")
logger.info(f"world_roads_builder_tool found - {world_roads_builder_tool_path}")
return world_roads_builder_tool_path
@staticmethod
def setup_mwm_diff_tool() -> AnyStr:
logger.info(f"Check mwm_diff_tool. Looking for it in {settings.BUILD_PATH} ...")
mwm_diff_tool_path = find_executable(settings.BUILD_PATH, "mwm_diff_tool")
logger.info(f"mwm_diff_tool found - {mwm_diff_tool_path}")
return mwm_diff_tool_path
@staticmethod
def setup_osm_tools() -> Dict[AnyStr, AnyStr]:
path = settings.OSM_TOOLS_PATH
osm_tool_names = [
settings.OSM_TOOL_CONVERT,
settings.OSM_TOOL_UPDATE,
settings.OSM_TOOL_FILTER,
]
logger.info("Check for the osmctools binaries...")
# Check in the configured path first.
tmp_paths = [os.path.join(path, t) for t in osm_tool_names]
if not all([is_executable(t) for t in tmp_paths]):
# Or use a system-wide installation.
tmp_paths = [shutil.which(t) for t in osm_tool_names]
if all([is_executable(t) for t in tmp_paths]):
osm_tool_paths = dict(zip(osm_tool_names, tmp_paths))
logger.info(f"Found osmctools at {', '.join(osm_tool_paths.values())}")
return osm_tool_paths
logger.info(f"osmctools are not found, building from the sources into {path}...")
os.makedirs(path, exist_ok=True)
return build_osmtools(settings.OSM_TOOLS_SRC_PATH)
def setup_borders(self):
temp_borders = self.paths.generation_borders_path
borders = PathProvider.borders_path()
for x in self.countries:
if x in WORLDS_NAMES:
continue
poly = f"{x}.poly"
make_symlink(os.path.join(borders, poly), os.path.join(temp_borders, poly))
make_symlink(temp_borders, os.path.join(self.paths.draft_path, "borders"))
def setup_osm2ft(self):
for x in os.listdir(self.paths.osm2ft_path):
p = os.path.join(self.paths.osm2ft_path, x)
if os.path.isfile(p) and x.endswith(".mwm.osm2ft"):
shutil.move(p, os.path.join(self.paths.mwm_path, x))


@ -0,0 +1,58 @@
import os
import subprocess
class MapsGeneratorError(Exception):
pass
class OptionNotFound(MapsGeneratorError):
pass
class ValidationError(MapsGeneratorError):
pass
class ContinueError(MapsGeneratorError):
pass
class SkipError(MapsGeneratorError):
pass
class BadExitStatusError(MapsGeneratorError):
pass
class ParseError(MapsGeneratorError):
pass
class FailedTest(MapsGeneratorError):
pass
def wait_and_raise_if_fail(p):
    if p.wait() != os.EX_OK:
        if isinstance(p, subprocess.Popen):
            args = list(p.args)
            # Read at most 256 bytes of each stream for the error message.
            logs = p.stdout.read(256).decode() if p.stdout is not None else None
            errors = p.stderr.read(256).decode() if p.stderr is not None else None
            if errors is not None and errors != logs:
                logs = f"{logs} and {errors}" if logs else errors
        else:
            args = list(p.args)
            logs = p.output.name
            if p.error.name != logs:
                logs += " and " + p.error.name
        msg = (
            f"The launch of {args[0]} failed.\n"
            f"Arguments used: {' '.join(args[1:])}\n"
            f"See details in {logs}"
        )
        raise BadExitStatusError(msg)
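The pattern this helper implements (wait for a child process, then raise with a snippet of its output on a nonzero exit) can be shown in isolation; the error class below is a stand-in for `BadExitStatusError`:

```python
import subprocess
import sys

class BadExitStatus(Exception):  # stand-in for BadExitStatusError
    pass

def check_process(p):
    # Wait for the child; on a nonzero exit, read a little stderr for context.
    if p.wait() != 0:
        details = p.stderr.read(256).decode() if p.stderr is not None else ""
        raise BadExitStatus(f"{p.args[0]} failed: {details.strip()}")

p = subprocess.Popen(
    [sys.executable, "-c", "import sys; sys.exit(0)"],
    stderr=subprocess.PIPE,
)
check_process(p)  # returns silently on success
```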


@ -0,0 +1,162 @@
import copy
import logging
import os
import subprocess
from maps_generator.generator.exceptions import OptionNotFound
from maps_generator.generator.exceptions import ValidationError
from maps_generator.generator.exceptions import wait_and_raise_if_fail
logger = logging.getLogger("maps_generator")
class GenTool:
OPTIONS = {
"dump_cities_boundaries": bool,
"emit_coasts": bool,
"fail_on_coasts": bool,
"generate_cameras": bool,
"generate_cities_boundaries": bool,
"generate_cities_ids": bool,
"generate_features": bool,
"generate_geo_objects_features": bool,
"generate_geo_objects_index": bool,
"generate_geometry": bool,
"generate_index": bool,
"generate_isolines_info": bool,
"generate_maxspeed": bool,
"generate_packed_borders": bool,
"generate_popular_places": bool,
"generate_region_features": bool,
"generate_regions": bool,
"generate_regions_kv": bool,
"generate_search_index": bool,
"generate_traffic_keys": bool,
"generate_world": bool,
"have_borders_for_whole_world": bool,
"make_city_roads": bool,
"make_coasts": bool,
"make_cross_mwm": bool,
"make_routing_index": bool,
"make_transit_cross_mwm": bool,
"make_transit_cross_mwm_experimental": bool,
"preprocess": bool,
"split_by_polygons": bool,
"stats_types": bool,
"version": bool,
"threads_count": int,
"booking_data": str,
"promo_catalog_cities": str,
"brands_data": str,
"brands_translations_data": str,
"cache_path": str,
"cities_boundaries_data": str,
"data_path": str,
"dump_wikipedia_urls": str,
"geo_objects_features": str,
"geo_objects_key_value": str,
"ids_without_addresses": str,
"idToWikidata": str,
"intermediate_data_path": str,
"isolines_path": str,
"addresses_path": str,
"nodes_list_path": str,
"node_storage": str,
"osm_file_name": str,
"osm_file_type": str,
"output": str,
"planet_version": str,
"popular_places_data": str,
"regions_features": str,
"regions_index": str,
"regions_key_value": str,
"srtm_path": str,
"transit_path": str,
"transit_path_experimental": str,
"world_roads_path": str,
"ugc_data": str,
"uk_postcodes_dataset": str,
"us_postcodes_dataset": str,
"user_resource_path": str,
"wikipedia_pages": str,
}
def __init__(
self, name_executable, out=subprocess.DEVNULL, err=subprocess.DEVNULL, **options
):
self.name_executable = name_executable
self.subprocess = None
self.output = out
self.error = err
self.options = {"threads_count": 1}
self.logger = logger
self.add_options(**options)
@property
def args(self):
return self._collect_cmd()
def add_options(self, **options):
if "logger" in options:
self.logger = options["logger"]
for k, v in options.items():
if k == "logger":
continue
if k not in GenTool.OPTIONS:
                raise OptionNotFound(f"{k} is an unavailable option")
if type(v) is not GenTool.OPTIONS[k]:
                raise ValidationError(
                    f"{k} requires {GenTool.OPTIONS[k]},"
                    f" but got {type(v)}"
                )
self.options[k] = str(v).lower() if type(v) is bool else v
return self
def run_async(self):
assert self.subprocess is None, "You forgot to call wait()"
cmd = self._collect_cmd()
self.subprocess = subprocess.Popen(
cmd, stdout=self.output, stderr=self.error, env=os.environ
)
self.logger.info(
f"Run generator tool [{self.get_build_version()}]:" f" {' '.join(cmd)} "
)
return self
def wait(self):
code = self.subprocess.wait()
self.subprocess = None
return code
def run(self):
self.run_async()
wait_and_raise_if_fail(self)
def branch(self):
c = GenTool(self.name_executable, out=self.output, err=self.error)
c.options = copy.deepcopy(self.options)
return c
def get_build_version(self):
p = subprocess.Popen(
[self.name_executable, "--version"],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
env=os.environ,
)
wait_and_raise_if_fail(p)
out, err = p.communicate()
return out.decode("utf-8").replace("\n", " ").strip()
def _collect_cmd(self):
options = ["".join(["--", k, "=", str(v)]) for k, v in self.options.items()]
return [self.name_executable, *options]
def run_gen_tool(*args, **kwargs):
GenTool(*args, **kwargs).run()
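The `--key=value` assembly done by `_collect_cmd` can be shown with plain data; booleans are lowered to `true`/`false` the same way `add_options` does it:

```python
def collect_cmd(executable, options):
    # Mirror add_options: booleans become "true"/"false", everything else str().
    normalized = {
        k: (str(v).lower() if isinstance(v, bool) else v) for k, v in options.items()
    }
    return [executable] + [f"--{k}={v}" for k, v in normalized.items()]

print(collect_cmd("generator_tool", {"preprocess": True, "threads_count": 4}))
# ['generator_tool', '--preprocess=true', '--threads_count=4']
```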


@ -0,0 +1,151 @@
import os
from typing import AnyStr
from typing import List
from typing import Optional
from typing import Type
from typing import Union
import filelock
from maps_generator.generator.env import Env
from maps_generator.generator.exceptions import ContinueError
from maps_generator.generator.stages import Stage
from maps_generator.generator.stages import get_stage_name
from maps_generator.generator.stages import stages
from maps_generator.generator.status import Status
from maps_generator.generator.status import without_stat_ext
class Generation:
"""
    Generation describes the map generation process. It contains stages.
For example:
generation = Generation(env)
generation.add_stage(s1)
generation.add_stage(s2)
generation.run()
"""
def __init__(self, env: Env, build_lock: bool = True):
self.env: Env = env
self.stages: List[Stage] = []
self.runnable_stages: Optional[List[Stage]] = None
self.build_lock: bool = build_lock
for country_stage in stages.countries_stages:
if self.is_skipped_stage(country_stage):
self.env.add_skipped_stage(country_stage)
for stage in stages.stages:
if self.is_skipped_stage(stage):
self.env.add_skipped_stage(stage)
def is_skipped_stage(self, stage: Union[Type[Stage], Stage]) -> bool:
return (
stage.is_production_only and not self.env.production
) or not self.env.is_accepted_stage(stage)
def add_stage(self, stage: Stage):
self.stages.append(stage)
if self.is_skipped_stage(stage):
self.env.add_skipped_stage(stage)
def pre_run(self):
skipped = set()
def traverse(current: Type[Stage]):
deps = stages.dependencies.get(current, [])
for d in deps:
skipped.add(d)
traverse(d)
for skipped_stage in self.env.skipped_stages:
traverse(skipped_stage)
for s in skipped:
self.env.add_skipped_stage(s)
self.runnable_stages = [s for s in self.stages if self.env.is_accepted_stage(s)]
def run(self, from_stage: Optional[AnyStr] = None):
self.pre_run()
if from_stage is not None:
self.reset_to_stage(from_stage)
if self.build_lock:
lock_filename = f"{os.path.join(self.env.paths.build_path, 'lock')}.lock"
with filelock.FileLock(lock_filename, timeout=1):
self.run_stages()
else:
self.run_stages()
def run_stages(self):
for stage in self.runnable_stages:
stage(self.env)
def reset_to_stage(self, stage_name: AnyStr):
        """
        Resets the generation state to stage_name.
        Status files are overwritten with new statuses according to stage_name.
        It assumes that stages have the following layout:
        stage1, ..., stage_mwm[country_stage_1, ..., country_stage_M], ..., stageN
        """
high_level_stages = [get_stage_name(s) for s in self.runnable_stages]
if not (
stage_name in high_level_stages
or any(stage_name == get_stage_name(s) for s in stages.countries_stages)
):
raise ContinueError(f"{stage_name} not in {', '.join(high_level_stages)}.")
if not os.path.exists(self.env.paths.status_path):
raise ContinueError(f"Status path {self.env.paths.status_path} not found.")
if not os.path.exists(self.env.paths.main_status_path):
raise ContinueError(
f"Status file {self.env.paths.main_status_path} not found."
)
countries_statuses_paths = []
countries = set(self.env.countries)
for f in os.listdir(self.env.paths.status_path):
full_name = os.path.join(self.env.paths.status_path, f)
if (
os.path.isfile(full_name)
and full_name != self.env.paths.main_status_path
and without_stat_ext(f) in countries
):
countries_statuses_paths.append(full_name)
def set_countries_stage(st):
for path in countries_statuses_paths:
Status(path, st).update_status()
def finish_countries_stage():
for path in countries_statuses_paths:
Status(path).finish()
def index(l: List, val):
try:
return l.index(val)
except ValueError:
return -1
mwm_stage_name = get_stage_name(stages.mwm_stage)
stage_mwm_index = index(high_level_stages, mwm_stage_name)
main_status = None
if (
stage_mwm_index == -1
or stage_name in high_level_stages[: stage_mwm_index + 1]
):
main_status = stage_name
set_countries_stage("")
elif stage_name in high_level_stages[stage_mwm_index + 1 :]:
main_status = stage_name
finish_countries_stage()
else:
main_status = get_stage_name(stages.mwm_stage)
set_countries_stage(stage_name)
Status(self.env.paths.main_status_path, main_status).update_status()
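The skip propagation in `pre_run` is a transitive walk over the dependency map: skipping a stage also skips everything it depends on. A minimal standalone version of that walk (with a cycle guard added for safety, and illustrative stage names):

```python
def propagate_skips(skipped, dependencies):
    # Expand the skipped set with every transitive dependency.
    result = set(skipped)

    def traverse(stage):
        for dep in dependencies.get(stage, ()):
            if dep not in result:
                result.add(dep)
                traverse(dep)

    for stage in skipped:
        traverse(stage)
    return result

deps = {"Mwm": {"Features"}, "Features": {"Download"}}
print(sorted(propagate_skips({"Mwm"}, deps)))
# ['Download', 'Features', 'Mwm']
```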


@ -0,0 +1,121 @@
import os
import subprocess
from maps_generator.generator import settings
from maps_generator.generator.exceptions import BadExitStatusError
from maps_generator.generator.exceptions import wait_and_raise_if_fail
def build_osmtools(path, output=subprocess.DEVNULL, error=subprocess.DEVNULL):
src = {
settings.OSM_TOOL_UPDATE: "osmupdate.c",
settings.OSM_TOOL_FILTER: "osmfilter.c",
settings.OSM_TOOL_CONVERT: "osmconvert.c",
}
ld_flags = ("-lz",)
cc = []
result = {}
for executable, src in src.items():
out = os.path.join(settings.OSM_TOOLS_PATH, executable)
op = [
settings.OSM_TOOLS_CC,
*settings.OSM_TOOLS_CC_FLAGS,
"-o",
out,
os.path.join(path, src),
*ld_flags,
]
s = subprocess.Popen(op, stdout=output, stderr=error)
cc.append(s)
result[executable] = out
messages = []
for c in cc:
if c.wait() != os.EX_OK:
messages.append(f"The launch of {' '.join(c.args)} failed.")
if messages:
raise BadExitStatusError("\n".join(messages))
return result
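`build_osmtools` launches all three compiles concurrently and only afterwards collects failures. The launch-then-collect pattern, with Python subprocesses standing in for the compiler invocations:

```python
import subprocess
import sys

cmds = [
    [sys.executable, "-c", "pass"],                      # succeeds
    [sys.executable, "-c", "import sys; sys.exit(1)"],   # fails
]
# Start everything first so the jobs run in parallel...
procs = [subprocess.Popen(c) for c in cmds]
# ...then wait and gather one message per failed job.
messages = [
    f"The launch of {' '.join(p.args)} failed." for p in procs if p.wait() != 0
]
print(len(messages))  # 1
```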
def osmconvert(
name_executable,
in_file,
out_file,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
run_async=False,
**kwargs,
):
env = os.environ.copy()
env["PATH"] = f"{settings.OSM_TOOLS_PATH}:{env['PATH']}"
p = subprocess.Popen(
[
name_executable,
in_file,
"--drop-author",
"--drop-version",
"--out-o5m",
f"-o={out_file}",
],
env=env,
stdout=output,
stderr=error,
)
if run_async:
return p
else:
wait_and_raise_if_fail(p)
def osmupdate(
name_executable,
in_file,
out_file,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
run_async=False,
**kwargs,
):
env = os.environ.copy()
env["PATH"] = f"{settings.OSM_TOOLS_PATH}:{env['PATH']}"
p = subprocess.Popen(
[
name_executable,
"--drop-author",
"--drop-version",
"--out-o5m",
"-v",
in_file,
out_file,
],
env=env,
stdout=output,
stderr=error,
)
if run_async:
return p
else:
wait_and_raise_if_fail(p)
def osmfilter(
name_executable,
in_file,
out_file,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
run_async=False,
**kwargs,
):
env = os.environ.copy()
env["PATH"] = f"{settings.OSM_TOOLS_PATH}:{env['PATH']}"
args = [name_executable, in_file, f"-o={out_file}"] + [
f"--{k.replace('_', '-')}={v}" for k, v in kwargs.items()
]
p = subprocess.Popen(args, env=env, stdout=output, stderr=error)
if run_async:
return p
else:
wait_and_raise_if_fail(p)
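The keyword-to-flag translation used by `osmfilter` (underscores become dashes, values appended after `=`) is easy to demonstrate on its own:

```python
def kwargs_to_flags(**kwargs):
    # e.g. keep="highway=" -> "--keep=highway="
    return [f"--{k.replace('_', '-')}={v}" for k, v in kwargs.items()]

print(kwargs_to_flags(keep="highway=", drop_version=""))
# ['--keep=highway=', '--drop-version=']
```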


@ -0,0 +1,333 @@
import argparse
import multiprocessing
import os
import site
import sys
from configparser import ConfigParser
from configparser import ExtendedInterpolation
from pathlib import Path
from typing import Any
from typing import AnyStr
from maps_generator.utils.md5 import md5_ext
from maps_generator.utils.system import total_virtual_memory
ETC_DIR = os.path.join(os.path.dirname(__file__), "..", "var", "etc")
parser = argparse.ArgumentParser(add_help=False)
opt_config = "--config"
parser.add_argument(opt_config, type=str, default="", help="Path to config")
def get_config_path(config_path: AnyStr):
    """
    Tries to get the --config value from the command line; if it is absent,
    falls back to the MM_GEN__CONFIG environment variable, then to config_path.
    """
argv = sys.argv
indexes = (-1, -1)
for i, opt in enumerate(argv):
if opt.startswith(f"{opt_config}="):
indexes = (i, i + 1)
if opt == opt_config:
indexes = (i, i + 2)
config_args = argv[indexes[0] : indexes[1]]
if config_args:
return parser.parse_args(config_args).config
    config_var = os.environ.get("MM_GEN__CONFIG")
return config_path if config_var is None else config_var
class CfgReader:
    """
    Config reader.
    There are three ways of getting an option, in priority order:
    1. From the system environment.
    2. From the config file.
    3. From the default values.
    To set an option via the environment, build the variable name as
    MM_GEN__ + [SECTION_NAME] + _ + [VALUE_NAME].
    """
def __init__(self, default_settings_path: AnyStr):
self.config = ConfigParser(interpolation=ExtendedInterpolation())
self.config.read([get_config_path(default_settings_path)])
def get_opt(self, s: AnyStr, v: AnyStr, default: Any = None):
val = CfgReader._get_env_val(s, v)
if val is not None:
return val
return self.config.get(s, v) if self.config.has_option(s, v) else default
def get_opt_path(self, s: AnyStr, v: AnyStr, default: AnyStr = ""):
return os.path.expanduser(self.get_opt(s, v, default))
@staticmethod
def _get_env_val(s: AnyStr, v: AnyStr):
return os.environ.get(f"MM_GEN__{s.upper()}_{v.upper()}")
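The three-level precedence the docstring describes (environment variable, then config, then default) can be sketched without ConfigParser; the variable naming follows `_get_env_val`, and the plain-dict config here is a stand-in:

```python
import os

def get_opt(section, name, config, default=None):
    # 1. MM_GEN__<SECTION>_<NAME> from the environment wins.
    env_val = os.environ.get(f"MM_GEN__{section.upper()}_{name.upper()}")
    if env_val is not None:
        return env_val
    # 2. Then the parsed config, 3. then the hard-coded default.
    return config.get(section, {}).get(name, default)

config = {"Main": {"DEBUG": "0"}}
os.environ["MM_GEN__MAIN_DEBUG"] = "1"
print(get_opt("Main", "DEBUG", config, default="0"))  # '1' -- env wins
del os.environ["MM_GEN__MAIN_DEBUG"]
print(get_opt("Main", "DEBUG", config))  # '0' -- falls back to config
```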
DEFAULT_PLANET_URL = "https://planet.openstreetmap.org/pbf/planet-latest.osm.pbf"
# Main section:
# If DEBUG is True, a little special planet is downloaded.
DEBUG = True
_HOME_PATH = str(Path.home())
_WORK_PATH = _HOME_PATH
TMPDIR = os.path.join(_HOME_PATH, "tmp")
MAIN_OUT_PATH = os.path.join(_WORK_PATH, "generation")
CACHE_PATH = ""
# Developer section:
BUILD_PATH = os.path.join(_WORK_PATH, "omim-build-relwithdebinfo")
OMIM_PATH = os.path.join(_WORK_PATH, "omim")
# Osm tools section:
OSM_TOOLS_SRC_PATH = os.path.join(OMIM_PATH, "tools", "osmctools")
OSM_TOOLS_PATH = os.path.join(_WORK_PATH, "osmctools")
# Generator tool section:
USER_RESOURCE_PATH = os.path.join(OMIM_PATH, "data")
NODE_STORAGE = "map"
# Stages section:
NEED_PLANET_UPDATE = False
THREADS_COUNT_FEATURES_STAGE = multiprocessing.cpu_count()
DATA_ARCHIVE_DIR = ""
DIFF_VERSION_DEPTH = 2
# Logging section:
LOG_FILE_PATH = os.path.join(MAIN_OUT_PATH, "generation.log")
# External resources section:
PLANET_URL = DEFAULT_PLANET_URL
PLANET_COASTS_URL = ""
UGC_URL = ""
HOTELS_URL = ""
PROMO_CATALOG_CITIES_URL = ""
PROMO_CATALOG_COUNTRIES_URL = ""
POPULARITY_URL = ""
SUBWAY_URL = ""
TRANSIT_URL = ""
NEED_BUILD_WORLD_ROADS = True
FOOD_URL = ""
FOOD_TRANSLATIONS_URL = ""
UK_POSTCODES_URL = ""
US_POSTCODES_URL = ""
SRTM_PATH = ""
ISOLINES_PATH = ""
ADDRESSES_PATH = ""
# Stats section:
STATS_TYPES_CONFIG = os.path.join(ETC_DIR, "stats_types_config.txt")
# Other variables:
PLANET = "planet"
POSSIBLE_GEN_TOOL_NAMES = ("generator_tool", "omim-generator_tool")
VERSION_FILE_NAME = "version.txt"
# Osm tools:
OSM_TOOL_CONVERT = "osmconvert"
OSM_TOOL_FILTER = "osmfilter"
OSM_TOOL_UPDATE = "osmupdate"
OSM_TOOLS_CC = "cc"
OSM_TOOLS_CC_FLAGS = [
"-O3",
]
# Planet and coasts:
PLANET_COASTS_GEOM_URL = os.path.join(PLANET_COASTS_URL, "latest_coasts.geom")
PLANET_COASTS_RAWGEOM_URL = os.path.join(PLANET_COASTS_URL, "latest_coasts.rawgeom")
# Common:
THREADS_COUNT = multiprocessing.cpu_count()
# for lib logging
LOGGING = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"standard": {"format": "[%(asctime)s] %(levelname)s %(module)s %(message)s"},
},
"handlers": {
"stdout": {
"level": "INFO",
"class": "logging.StreamHandler",
"formatter": "standard",
},
"file": {
"level": "DEBUG",
"class": "logging.handlers.WatchedFileHandler",
"formatter": "standard",
"filename": LOG_FILE_PATH,
},
},
"loggers": {
"maps_generator": {
"handlers": ["stdout", "file"],
"level": "DEBUG",
"propagate": True,
}
},
}
def init(default_settings_path: AnyStr):
# Try to read a config and to overload default settings
cfg = CfgReader(default_settings_path)
# Main section:
global DEBUG
global TMPDIR
global MAIN_OUT_PATH
global CACHE_PATH
_DEBUG = cfg.get_opt("Main", "DEBUG")
DEBUG = DEBUG if _DEBUG is None else int(_DEBUG)
TMPDIR = cfg.get_opt_path("Main", "TMPDIR", TMPDIR)
MAIN_OUT_PATH = cfg.get_opt_path("Main", "MAIN_OUT_PATH", MAIN_OUT_PATH)
CACHE_PATH = cfg.get_opt_path("Main", "CACHE_PATH", CACHE_PATH)
# Developer section:
global BUILD_PATH
global OMIM_PATH
BUILD_PATH = cfg.get_opt_path("Developer", "BUILD_PATH", BUILD_PATH)
OMIM_PATH = cfg.get_opt_path("Developer", "OMIM_PATH", OMIM_PATH)
# Osm tools section:
global OSM_TOOLS_SRC_PATH
global OSM_TOOLS_PATH
OSM_TOOLS_SRC_PATH = cfg.get_opt_path(
"Osm tools", "OSM_TOOLS_SRC_PATH", OSM_TOOLS_SRC_PATH
)
OSM_TOOLS_PATH = cfg.get_opt_path("Osm tools", "OSM_TOOLS_PATH", OSM_TOOLS_PATH)
# Generator tool section:
global USER_RESOURCE_PATH
global NODE_STORAGE
USER_RESOURCE_PATH = cfg.get_opt_path(
"Generator tool", "USER_RESOURCE_PATH", USER_RESOURCE_PATH
)
NODE_STORAGE = cfg.get_opt("Generator tool", "NODE_STORAGE", NODE_STORAGE)
assert os.path.exists(OMIM_PATH) is True, f"Can't find OMIM_PATH (set to {OMIM_PATH})"
if not os.path.exists(USER_RESOURCE_PATH):
from data_files import find_data_files
USER_RESOURCE_PATH = find_data_files("omim-data")
assert USER_RESOURCE_PATH is not None
import borders
        # Note: if maps_generator is installed as a system package and
        # borders.init() is called for the first time, it might return False
        # because root permission is required.
assert borders.init()
# Stages section:
global NEED_PLANET_UPDATE
global DATA_ARCHIVE_DIR
global DIFF_VERSION_DEPTH
global THREADS_COUNT_FEATURES_STAGE
NEED_PLANET_UPDATE = cfg.get_opt("Stages", "NEED_PLANET_UPDATE", NEED_PLANET_UPDATE)
DATA_ARCHIVE_DIR = cfg.get_opt_path(
"Stages", "DATA_ARCHIVE_DIR", DATA_ARCHIVE_DIR
)
DIFF_VERSION_DEPTH = int(cfg.get_opt(
"Stages", "DIFF_VERSION_DEPTH", DIFF_VERSION_DEPTH
))
threads_count = int(
cfg.get_opt(
"Generator tool",
"THREADS_COUNT_FEATURES_STAGE",
THREADS_COUNT_FEATURES_STAGE,
)
)
if threads_count > 0:
THREADS_COUNT_FEATURES_STAGE = threads_count
# Logging section:
global LOG_FILE_PATH
global LOGGING
LOG_FILE_PATH = os.path.join(MAIN_OUT_PATH, "generation.log")
LOG_FILE_PATH = cfg.get_opt_path("Logging", "MAIN_LOG", LOG_FILE_PATH)
os.makedirs(os.path.dirname(os.path.abspath(LOG_FILE_PATH)), exist_ok=True)
LOGGING["handlers"]["file"]["filename"] = LOG_FILE_PATH
# External section:
global PLANET_URL
global PLANET_MD5_URL
global PLANET_COASTS_URL
global UGC_URL
global HOTELS_URL
global PROMO_CATALOG_CITIES_URL
global PROMO_CATALOG_COUNTRIES_URL
global POPULARITY_URL
global SUBWAY_URL
global TRANSIT_URL
global NEED_BUILD_WORLD_ROADS
global FOOD_URL
global UK_POSTCODES_URL
global US_POSTCODES_URL
global FOOD_TRANSLATIONS_URL
global SRTM_PATH
global ISOLINES_PATH
global ADDRESSES_PATH
PLANET_URL = cfg.get_opt_path("External", "PLANET_URL", PLANET_URL)
PLANET_MD5_URL = cfg.get_opt_path("External", "PLANET_MD5_URL", md5_ext(PLANET_URL))
PLANET_COASTS_URL = cfg.get_opt_path(
"External", "PLANET_COASTS_URL", PLANET_COASTS_URL
)
UGC_URL = cfg.get_opt_path("External", "UGC_URL", UGC_URL)
HOTELS_URL = cfg.get_opt_path("External", "HOTELS_URL", HOTELS_URL)
PROMO_CATALOG_CITIES_URL = cfg.get_opt_path(
"External", "PROMO_CATALOG_CITIES_URL", PROMO_CATALOG_CITIES_URL
)
PROMO_CATALOG_COUNTRIES_URL = cfg.get_opt_path(
"External", "PROMO_CATALOG_COUNTRIES_URL", PROMO_CATALOG_COUNTRIES_URL
)
POPULARITY_URL = cfg.get_opt_path("External", "POPULARITY_URL", POPULARITY_URL)
SUBWAY_URL = cfg.get_opt("External", "SUBWAY_URL", SUBWAY_URL)
TRANSIT_URL = cfg.get_opt("External", "TRANSIT_URL", TRANSIT_URL)
NEED_BUILD_WORLD_ROADS = cfg.get_opt("External", "NEED_BUILD_WORLD_ROADS", NEED_BUILD_WORLD_ROADS)
FOOD_URL = cfg.get_opt("External", "FOOD_URL", FOOD_URL)
UK_POSTCODES_URL = cfg.get_opt("External", "UK_POSTCODES_URL", UK_POSTCODES_URL)
US_POSTCODES_URL = cfg.get_opt("External", "US_POSTCODES_URL", US_POSTCODES_URL)
FOOD_TRANSLATIONS_URL = cfg.get_opt(
"External", "FOOD_TRANSLATIONS_URL", FOOD_TRANSLATIONS_URL
)
SRTM_PATH = cfg.get_opt_path("External", "SRTM_PATH", SRTM_PATH)
ISOLINES_PATH = cfg.get_opt_path("External", "ISOLINES_PATH", ISOLINES_PATH)
ADDRESSES_PATH = cfg.get_opt_path("External", "ADDRESSES_PATH", ADDRESSES_PATH)
# Stats section:
global STATS_TYPES_CONFIG
STATS_TYPES_CONFIG = cfg.get_opt_path(
"Stats", "STATS_TYPES_CONFIG", STATS_TYPES_CONFIG
)
# Common:
global THREADS_COUNT
threads_count = int(cfg.get_opt("Common", "THREADS_COUNT", THREADS_COUNT))
if threads_count > 0:
THREADS_COUNT = threads_count
    # Planet and coasts:
global PLANET_COASTS_GEOM_URL
global PLANET_COASTS_RAWGEOM_URL
PLANET_COASTS_GEOM_URL = os.path.join(PLANET_COASTS_URL, "latest_coasts.geom")
PLANET_COASTS_RAWGEOM_URL = os.path.join(PLANET_COASTS_URL, "latest_coasts.rawgeom")
if DEBUG:
PLANET_URL = "https://www.dropbox.com/s/m3ru5tnj8g9u4cz/planet-latest.o5m?raw=1"
PLANET_MD5_URL = (
"https://www.dropbox.com/s/8wdl2hy22jgisk5/planet-latest.o5m.md5?raw=1"
)
NEED_PLANET_UPDATE = False


@ -0,0 +1,380 @@
"""
This file contains decorators that define stages.
There are two main types of stages:
1. outer_stage - a high-level stage.
2. country_stage - a stage that applies to country files (*.mwm).
A country_stage may be nested inside an outer stage; mwm_stage is the only
stage that contains country_stages.
"""
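The registration scheme described above can be reduced to a tiny sketch: a decorator records each stage class in a module-level registry at class-definition time, which is essentially what `outer_stage` and `country_stage` do with the global `stages` object below. The names here are illustrative:

```python
registry = []

def register_stage(cls):
    # Record the class at definition time and hand it back unchanged.
    registry.append(cls)
    return cls

@register_stage
class StageDownload:
    pass

@register_stage
class StageFeatures:
    pass

print([c.__name__ for c in registry])
# ['StageDownload', 'StageFeatures']
```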
import datetime
import logging
import os
import time
from abc import ABC
from abc import abstractmethod
from collections import defaultdict
from typing import AnyStr
from typing import Callable
from typing import Dict
from typing import List
from typing import Optional
from typing import Type
from typing import Union
import filelock
from maps_generator.generator import status
from maps_generator.generator.exceptions import FailedTest
from maps_generator.utils.file import download_files
from maps_generator.utils.file import normalize_url_to_path_dict
from maps_generator.utils.log import DummyObject
from maps_generator.utils.log import create_file_handler
from maps_generator.utils.log import create_file_logger
logger = logging.getLogger("maps_generator")
class InternalDependency:
def __init__(self, url, path_method, mode=""):
self.url = url
self.path_method = path_method
self.mode = mode
class Test:
def __init__(self, test, need_run=None, is_pretest=False):
self._test = test
self._need_run = need_run
self.is_pretest = is_pretest
@property
def name(self):
return self._test.__name__
def need_run(self, env, _logger):
if self._need_run is None:
return True
if callable(self._need_run):
return self._need_run(env, _logger)
return self._need_run
def test(self, env, _logger, *args, **kwargs):
        try:
            res = self._test(env, _logger, *args, **kwargs)
        except Exception as e:
            raise FailedTest(f"Test {self.name} failed.") from e
        if not res:
            raise FailedTest(f"Test {self.name} failed.")
        _logger.info(f"Test {self.name} completed successfully.")
class Stage(ABC):
need_planet_lock = False
need_build_lock = False
is_helper = False
is_mwm_stage = False
is_production_only = False
def __init__(self, **args):
self.args = args
def __call__(self, env: "Env"):
return self.apply(env, **self.args)
@abstractmethod
def apply(self, *args, **kwargs):
pass
def get_stage_name(stage: Union[Type[Stage], Stage]) -> AnyStr:
n = stage.__class__.__name__ if isinstance(stage, Stage) else stage.__name__
return n.replace("Stage", "")
def get_stage_type(stage: Union[Type[Stage], AnyStr]):
from . import stages_declaration as sd
if isinstance(stage, str):
if not stage.startswith("Stage"):
stage = f"Stage{stage}"
return getattr(sd, stage)
return stage
class Stages:
"""Stages class is used for storing all stages."""
def __init__(self):
self.mwm_stage: Optional[Type[Stage]] = None
self.countries_stages: List[Type[Stage]] = []
self.stages: List[Type[Stage]] = []
self.helper_stages: List[Type[Stage]] = []
self.dependencies = defaultdict(set)
def init(self):
# We normalize self.dependencies to Dict[Type[Stage], Set[Type[Stage]]].
dependencies = defaultdict(set)
for k, v in self.dependencies.items():
dependencies[get_stage_type(k)] = set(get_stage_type(x) for x in v)
self.dependencies = dependencies
def set_mwm_stage(self, stage: Type[Stage]):
assert self.mwm_stage is None
self.mwm_stage = stage
def add_helper_stage(self, stage: Type[Stage]):
self.helper_stages.append(stage)
def add_country_stage(self, stage: Type[Stage]):
self.countries_stages.append(stage)
def add_stage(self, stage: Type[Stage]):
self.stages.append(stage)
def add_dependency_for(self, stage: Type[Stage], *deps):
for dep in deps:
self.dependencies[stage].add(dep)
def get_invisible_stages_names(self) -> List[AnyStr]:
return [get_stage_name(st) for st in self.helper_stages]
def get_visible_stages_names(self) -> List[AnyStr]:
"""Returns all stages names except helper stages names."""
stages = []
for s in self.stages:
stages.append(get_stage_name(s))
if s == self.mwm_stage:
stages += [get_stage_name(st) for st in self.countries_stages]
return stages
def is_valid_stage_name(self, stage_name) -> bool:
return get_stage_name(self.mwm_stage) == stage_name or any(
any(stage_name == get_stage_name(x) for x in c)
for c in [self.countries_stages, self.stages, self.helper_stages]
)
# A global variable stage contains all possible stages.
stages = Stages()
def outer_stage(stage: Type[Stage]) -> Type[Stage]:
    """Decorator that defines a high-level stage."""
if stage.is_helper:
stages.add_helper_stage(stage)
else:
stages.add_stage(stage)
if stage.is_mwm_stage:
stages.set_mwm_stage(stage)
def new_apply(method):
def apply(obj: Stage, env: "Env", *args, **kwargs):
name = get_stage_name(obj)
logfile = os.path.join(env.paths.log_path, f"{name}.log")
log_handler = create_file_handler(logfile)
logger.addHandler(log_handler)
# This message is used as an anchor for parsing logs.
# See maps_generator/checks/logs/logs_reader.py STAGE_START_MSG_PATTERN
logger.info(f"Stage {name}: start ...")
t = time.time()
try:
if not env.is_accepted_stage(stage):
logger.info(f"Stage {name} was not accepted.")
return
main_status = env.main_status
main_status.init(env.paths.main_status_path, name)
if main_status.need_skip():
logger.warning(f"Stage {name} was skipped.")
return
main_status.update_status()
env.set_subprocess_out(log_handler.stream)
method(obj, env, *args, **kwargs)
finally:
d = time.time() - t
# This message is used as an anchor for parsing logs.
# See maps_generator/checks/logs/logs_reader.py STAGE_FINISH_MSG_PATTERN
logger.info(
f"Stage {name}: finished in {str(datetime.timedelta(seconds=d))}"
)
logger.removeHandler(log_handler)
return apply
stage.apply = new_apply(stage.apply)
return stage
def country_stage_status(stage: Type[Stage]) -> Type[Stage]:
    """Helper decorator that maintains the per-country status file."""
def new_apply(method):
def apply(obj: Stage, env: "Env", country: AnyStr, *args, **kwargs):
name = get_stage_name(obj)
_logger = DummyObject()
countries_meta = env.countries_meta
if "logger" in countries_meta[country]:
_logger, _ = countries_meta[country]["logger"]
if not env.is_accepted_stage(stage):
_logger.info(f"Stage {name} was not accepted.")
return
if "status" not in countries_meta[country]:
countries_meta[country]["status"] = status.Status()
country_status = countries_meta[country]["status"]
status_file = os.path.join(
env.paths.status_path, status.with_stat_ext(country)
)
country_status.init(status_file, name)
if country_status.need_skip():
_logger.warning(f"Stage {name} was skipped.")
return
country_status.update_status()
method(obj, env, country, *args, **kwargs)
return apply
stage.apply = new_apply(stage.apply)
return stage
def country_stage_log(stage: Type[Stage]) -> Type[Stage]:
    """Helper decorator that maintains the per-country log file."""
def new_apply(method):
def apply(obj: Stage, env: "Env", country: AnyStr, *args, **kwargs):
name = get_stage_name(obj)
log_file = os.path.join(env.paths.log_path, f"{country}.log")
countries_meta = env.countries_meta
if "logger" not in countries_meta[country]:
countries_meta[country]["logger"] = create_file_logger(log_file)
_logger, log_handler = countries_meta[country]["logger"]
# This message is used as an anchor for parsing logs.
# See maps_generator/checks/logs/logs_reader.py STAGE_START_MSG_PATTERN
_logger.info(f"Stage {name}: start ...")
t = time.time()
env.set_subprocess_out(log_handler.stream, country)
method(obj, env, country, *args, logger=_logger, **kwargs)
d = time.time() - t
# This message is used as an anchor for parsing logs.
# See maps_generator/checks/logs/logs_reader.py STAGE_FINISH_MSG_PATTERN
_logger.info(
f"Stage {name}: finished in {str(datetime.timedelta(seconds=d))}"
)
return apply
stage.apply = new_apply(stage.apply)
return stage
def test_stage(*tests: Test) -> Callable[[Type[Stage]], Type[Stage]]:
def new_apply(method):
def apply(obj: Stage, env: "Env", *args, **kwargs):
_logger = kwargs["logger"] if "logger" in kwargs else logger
def run_tests(tests):
for test in tests:
if test.need_run(env, _logger):
test.test(env, _logger, *args, **kwargs)
else:
_logger.info(f"Test {test.name} was skipped.")
run_tests(filter(lambda t: t.is_pretest, tests))
method(obj, env, *args, **kwargs)
run_tests(filter(lambda t: not t.is_pretest, tests))
return apply
def wrapper(stage: Type[Stage]) -> Type[Stage]:
stage.apply = new_apply(stage.apply)
return stage
return wrapper
def country_stage(stage: Type[Stage]) -> Type[Stage]:
"""It's decorator that defines country stage."""
if stage.is_helper:
stages.add_helper_stage(stage)
else:
stages.add_country_stage(stage)
return country_stage_log(country_stage_status(stage))
def mwm_stage(stage: Type[Stage]) -> Type[Stage]:
stage.is_mwm_stage = True
return stage
def production_only(stage: Type[Stage]) -> Type[Stage]:
stage.is_production_only = True
return stage
def helper_stage_for(*deps) -> Callable[[Type[Stage]], Type[Stage]]:
def wrapper(stage: Type[Stage]) -> Type[Stage]:
stages.add_dependency_for(stage, *deps)
stage.is_helper = True
return stage
return wrapper
def depends_from_internal(*deps) -> Callable[[Type[Stage]], Type[Stage]]:
def get_urls(
env: "Env", internal_dependencies: List[InternalDependency]
) -> Dict[AnyStr, AnyStr]:
deps = {}
for d in internal_dependencies:
if "p" in d.mode and not env.production or not d.url:
continue
path = None
if type(d.path_method) is property:
path = d.path_method.__get__(env.paths)
assert path is not None, type(d.path_method)
deps[d.url] = path
return deps
def download_under_lock(env: "Env", urls: Dict[AnyStr, AnyStr], stage_name: AnyStr):
lock_name = f"{os.path.join(env.paths.status_path, stage_name)}.lock"
status_name = f"{os.path.join(env.paths.status_path, stage_name)}.download"
with filelock.FileLock(lock_name):
s = status.Status(status_name)
if not s.is_finished():
urls = normalize_url_to_path_dict(urls)
download_files(urls, env.force_download_files)
s.finish()
def new_apply(method):
def apply(obj: Stage, env: "Env", *args, **kwargs):
if hasattr(obj, "internal_dependencies") and obj.internal_dependencies:
urls = get_urls(env, obj.internal_dependencies)
if urls:
download_under_lock(env, urls, get_stage_name(obj))
method(obj, env, *args, **kwargs)
return apply
def wrapper(stage: Type[Stage]) -> Type[Stage]:
stage.internal_dependencies = deps
stage.apply = new_apply(stage.apply)
return stage
return wrapper
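All of the decorators above share one mechanism: `stage.apply` is replaced by a wrapper that runs extra behaviour around the original method. A minimal standalone sketch of that pattern (the names here are hypothetical, not part of the pipeline):

```python
def with_logging(apply):
    # Wrap a stage's apply() so extra behaviour runs around the original call,
    # mirroring how country_stage_log and country_stage_status replace stage.apply.
    def wrapped(name, log):
        log.append(f"start {name}")
        apply(name, log)
        log.append(f"finish {name}")
    return wrapped

log = []
with_logging(lambda name, log: log.append(f"body {name}"))("Index", log)
```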


@ -0,0 +1,446 @@
""""
This file contains possible stages that maps_generator can run.
Some algorithms suppose a maps genration processes looks like:
stage1, ..., stage_mwm[country_stage_1, ..., country_stage_M], ..., stageN
Only stage_mwm can contain country_
"""
import datetime
import json
import logging
import multiprocessing
import os
import shutil
import tarfile
import errno
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import AnyStr
from typing import Type
import maps_generator.generator.diffs as diffs
import maps_generator.generator.stages_tests as st
# from descriptions.descriptions_downloader import check_and_get_checker
# from descriptions.descriptions_downloader import download_from_wikidata_tags
# from descriptions.descriptions_downloader import download_from_wikipedia_tags
from maps_generator.generator import coastline
from maps_generator.generator import settings
from maps_generator.generator import steps
from maps_generator.generator.env import Env
from maps_generator.generator.env import PathProvider
from maps_generator.generator.env import WORLD_COASTS_NAME
from maps_generator.generator.env import WORLD_NAME
from maps_generator.generator.exceptions import BadExitStatusError
from maps_generator.generator.gen_tool import run_gen_tool
from maps_generator.generator.stages import InternalDependency as D
from maps_generator.generator.stages import Stage
from maps_generator.generator.stages import Test
from maps_generator.generator.stages import country_stage
from maps_generator.generator.stages import depends_from_internal
from maps_generator.generator.stages import helper_stage_for
from maps_generator.generator.stages import mwm_stage
from maps_generator.generator.stages import outer_stage
from maps_generator.generator.stages import production_only
from maps_generator.generator.stages import test_stage
from maps_generator.generator.statistics import get_stages_info
from maps_generator.utils.file import download_files
from maps_generator.utils.file import is_verified
from post_generation.hierarchy_to_countries import hierarchy_to_countries
from post_generation.inject_promo_ids import inject_promo_ids
# create_csv comes from the local ads post-generation step (assumed module path).
from post_generation.localads_mwm_to_csv import create_csv
logger = logging.getLogger("maps_generator")
def is_accepted(env: Env, stage: Type[Stage]) -> bool:
return env.is_accepted_stage(stage)
@outer_stage
class StageDownloadAndConvertPlanet(Stage):
def apply(self, env: Env, force_download: bool = True, **kwargs):
if force_download or not is_verified(env.paths.planet_o5m):
steps.step_download_and_convert_planet(
env, force_download=force_download, **kwargs
)
@outer_stage
class StageUpdatePlanet(Stage):
def apply(self, env: Env, **kwargs):
steps.step_update_planet(env, **kwargs)
@outer_stage
class StageCoastline(Stage):
def apply(self, env: Env, use_old_if_fail=True):
coasts_geom = "WorldCoasts.geom"
coasts_rawgeom = "WorldCoasts.rawgeom"
try:
coastline.make_coastline(env)
except BadExitStatusError as e:
if not use_old_if_fail:
raise e
logger.warning("Build coasts failed. Try to download the coasts...")
download_files(
{
settings.PLANET_COASTS_GEOM_URL: os.path.join(
env.paths.coastline_path, coasts_geom
),
settings.PLANET_COASTS_RAWGEOM_URL: os.path.join(
env.paths.coastline_path, coasts_rawgeom
),
}
)
for f in [coasts_geom, coasts_rawgeom]:
path = os.path.join(env.paths.coastline_path, f)
shutil.copy2(path, env.paths.intermediate_data_path)
@outer_stage
class StagePreprocess(Stage):
def apply(self, env: Env, **kwargs):
steps.step_preprocess(env, **kwargs)
@outer_stage
@depends_from_internal(
D(settings.HOTELS_URL, PathProvider.hotels_path, "p"),
D(settings.PROMO_CATALOG_CITIES_URL, PathProvider.promo_catalog_cities_path, "p"),
D(settings.POPULARITY_URL, PathProvider.popularity_path, "p"),
D(settings.FOOD_URL, PathProvider.food_paths, "p"),
D(settings.FOOD_TRANSLATIONS_URL, PathProvider.food_translations_path, "p"),
)
@test_stage(
Test(st.make_test_booking_data(max_days=7), lambda e, _: e.production, True)
)
class StageFeatures(Stage):
def apply(self, env: Env):
extra = {}
if is_accepted(env, StageDescriptions):
extra.update({"idToWikidata": env.paths.id_to_wikidata_path})
if env.production:
extra.update(
{
"booking_data": env.paths.hotels_path,
"promo_catalog_cities": env.paths.promo_catalog_cities_path,
"popular_places_data": env.paths.popularity_path,
"brands_data": env.paths.food_paths,
"brands_translations_data": env.paths.food_translations_path,
}
)
if is_accepted(env, StageCoastline):
extra.update({"emit_coasts": True})
if is_accepted(env, StageIsolinesInfo):
extra.update({"isolines_path": PathProvider.isolines_path()})
extra.update({"addresses_path": PathProvider.addresses_path()})
steps.step_features(env, **extra)
if os.path.exists(env.paths.packed_polygons_path):
shutil.copy2(env.paths.packed_polygons_path, env.paths.mwm_path)
@outer_stage
@helper_stage_for("StageDescriptions")
class StageDownloadDescriptions(Stage):
def apply(self, env: Env):
"""
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(),
err=env.get_subprocess_out(),
data_path=env.paths.data_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
dump_wikipedia_urls=env.paths.wiki_url_path,
idToWikidata=env.paths.id_to_wikidata_path,
threads_count=settings.THREADS_COUNT,
)
# https://en.wikipedia.org/wiki/Wikipedia:Multilingual_statistics
langs = ("en", "de", "fr", "es", "ru", "tr")
checker = check_and_get_checker(env.paths.popularity_path)
download_from_wikipedia_tags(
env.paths.wiki_url_path, env.paths.descriptions_path, langs, checker
)
download_from_wikidata_tags(
env.paths.id_to_wikidata_path, env.paths.descriptions_path, langs, checker
)
"""
# The src folder is hardcoded here and must exist on the map-building machine
src = "/home/planet/wikipedia/descriptions"
# The dest folder will generally become build/*/intermediate_data/descriptions
dest = env.paths.descriptions_path
# A missing source folder is fatal: fail here rather than create a dangling symlink.
if os.path.isdir(src):
print("Found %s" % src)
else:
raise FileNotFoundError(errno.ENOENT, os.strerror(errno.ENOENT), src)
# An empty "descriptions" folder (or a stale symlink) may already exist; remove it.
try:
if os.path.isdir(dest) and not os.path.islink(dest):
shutil.rmtree(dest)
elif os.path.lexists(dest):
os.remove(dest)
except OSError as e:
print("cleanup error: %s - %s" % (e.filename, e.strerror))
os.symlink(src, dest)
@outer_stage
@mwm_stage
class StageMwm(Stage):
def apply(self, env: Env):
tmp_mwm_names = env.get_tmp_mwm_names()
if tmp_mwm_names:
logger.info(f'Number of feature data .mwm.tmp country files to process: {len(tmp_mwm_names)}')
with ThreadPoolExecutor(settings.THREADS_COUNT) as pool:
pool.map(
lambda c: StageMwm.make_mwm(c, env),
tmp_mwm_names
)
else:
# TODO: list all countries that were not found?
logger.warning(f'There are no feature data .mwm.tmp country files to process in {env.paths.intermediate_tmp_path}!')
logger.warning('Perhaps the countries requested for generation are not present in the supplied planet file?')
@staticmethod
def make_mwm(country: AnyStr, env: Env):
logger.info(f'Starting mwm generation for {country}')
world_stages = {
WORLD_NAME: [
StageIndex,
StageCitiesIdsWorld,
StagePopularityWorld,
StagePrepareRoutingWorld,
StageRoutingWorld,
StageMwmStatistics,
],
WORLD_COASTS_NAME: [StageIndex, StageMwmStatistics],
}
mwm_stages = [
StageIndex,
StageUgc,
StageSrtm,
StageIsolinesInfo,
StageDescriptions,
# call after descriptions
StagePopularity,
StageRouting,
StageRoutingTransit,
StageMwmDiffs,
StageMwmStatistics,
]
for stage in world_stages.get(country, mwm_stages):
logger.info(f'{country} mwm stage {stage.__name__}: start...')
stage(country=country)(env)
env.finish_mwm(country)
logger.info(f'Finished mwm generation for {country}')
@country_stage
class StageIndex(Stage):
def apply(self, env: Env, country, **kwargs):
if country == WORLD_NAME:
steps.step_index_world(env, country, **kwargs)
elif country == WORLD_COASTS_NAME:
steps.step_coastline_index(env, country, **kwargs)
else:
kwargs.update(
{
"uk_postcodes_dataset": settings.UK_POSTCODES_URL,
"us_postcodes_dataset": settings.US_POSTCODES_URL,
}
)
steps.step_index(env, country, **kwargs)
@country_stage
@production_only
class StageCitiesIdsWorld(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_cities_ids_world(env, country, **kwargs)
@country_stage
@helper_stage_for("StageRoutingWorld")
# ToDo: Are we sure that this stage will be skipped if StageRoutingWorld is skipped?
class StagePrepareRoutingWorld(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_prepare_routing_world(env, country, **kwargs)
@country_stage
class StageRoutingWorld(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_routing_world(env, country, **kwargs)
@country_stage
@depends_from_internal(D(settings.UGC_URL, PathProvider.ugc_path),)
@production_only
class StageUgc(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_ugc(env, country, **kwargs)
@country_stage
class StagePopularity(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_popularity(env, country, **kwargs)
@country_stage
class StagePopularityWorld(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_popularity_world(env, country, **kwargs)
@country_stage
class StageSrtm(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_srtm(env, country, **kwargs)
@country_stage
class StageIsolinesInfo(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_isolines_info(env, country, **kwargs)
@country_stage
class StageDescriptions(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_description(env, country, **kwargs)
@country_stage
class StageRouting(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_routing(env, country, **kwargs)
@country_stage
@depends_from_internal(
D(settings.SUBWAY_URL, PathProvider.subway_path),
D(settings.TRANSIT_URL, PathProvider.transit_path_experimental),
)
class StageRoutingTransit(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_routing_transit(env, country, **kwargs)
@country_stage
class StageMwmDiffs(Stage):
def apply(self, env: Env, country, logger, **kwargs):
data_dir = diffs.DataDir(
diff_tool=env.diff_tool,
mwm_name=f"{country}.mwm",
new_version_dir=env.paths.mwm_path,
old_version_root_dir=settings.DATA_ARCHIVE_DIR,
)
diffs.mwm_diff_calculation(data_dir, logger, depth=settings.DIFF_VERSION_DEPTH)
@country_stage
@helper_stage_for("StageStatistics")
class StageMwmStatistics(Stage):
def apply(self, env: Env, country, **kwargs):
steps.step_statistics(env, country, **kwargs)
@outer_stage
@depends_from_internal(
D(
settings.PROMO_CATALOG_COUNTRIES_URL,
PathProvider.promo_catalog_countries_path,
"p",
),
D(settings.PROMO_CATALOG_CITIES_URL, PathProvider.promo_catalog_cities_path, "p"),
)
class StageCountriesTxt(Stage):
def apply(self, env: Env):
countries = hierarchy_to_countries(
env.paths.old_to_new_path,
env.paths.borders_to_osm_path,
env.paths.countries_synonyms_path,
env.paths.hierarchy_path,
env.paths.mwm_path,
env.paths.mwm_version,
)
if env.production:
inject_promo_ids(
countries,
env.paths.promo_catalog_cities_path,
env.paths.promo_catalog_countries_path,
env.paths.mwm_path,
env.paths.types_path,
env.paths.mwm_path,
)
with open(env.paths.counties_txt_path, "w") as f:
json.dump(countries, f, ensure_ascii=False, indent=1)
@outer_stage
@production_only
class StageLocalAds(Stage):
def apply(self, env: Env):
create_csv(
env.paths.localads_path,
env.paths.mwm_path,
env.paths.mwm_path,
env.mwm_version,
multiprocessing.cpu_count(),
)
with tarfile.open(f"{env.paths.localads_path}.tar.gz", "w:gz") as tar:
for filename in os.listdir(env.paths.localads_path):
tar.add(os.path.join(env.paths.localads_path, filename), arcname=filename)
@outer_stage
class StageStatistics(Stage):
def apply(self, env: Env):
steps_info = get_stages_info(env.paths.log_path, {"statistics"})
stats = defaultdict(lambda: defaultdict(dict))
stats["steps"] = steps_info["steps"]
for country in env.get_tmp_mwm_names():
with open(os.path.join(env.paths.stats_path, f"{country}.json")) as f:
stats["countries"][country] = {
"types": json.load(f),
"steps": steps_info["countries"][country],
}
def default(o):
if isinstance(o, datetime.timedelta):
return str(o)
raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")
with open(os.path.join(env.paths.stats_path, "stats.json"), "w") as f:
json.dump(
stats, f, ensure_ascii=False, sort_keys=True, indent=2, default=default
)
@outer_stage
class StageCleanup(Stage):
def apply(self, env: Env):
logger.info(
f"osm2ft files will be moved from {env.paths.mwm_path} "
f"to {env.paths.osm2ft_path}."
)
for x in os.listdir(env.paths.mwm_path):
p = os.path.join(env.paths.mwm_path, x)
if os.path.isfile(p) and x.endswith(".mwm.osm2ft"):
shutil.move(p, os.path.join(env.paths.osm2ft_path, x))
logger.info(f"{env.paths.draft_path} will be removed.")
shutil.rmtree(env.paths.draft_path)


@ -0,0 +1,27 @@
import os
from datetime import datetime
import json
from maps_generator.generator import settings
from maps_generator.generator.env import Env
from maps_generator.utils.file import download_file
def make_test_booking_data(max_days):
def test_booking_data(env: Env, logger, *args, **kwargs):
if not settings.HOTELS_URL:
return None
base_url, _ = settings.HOTELS_URL.rsplit("/", maxsplit=1)
url = f"{base_url}/meta.json"
meta_path = os.path.join(env.paths.tmp_dir(), "hotels-meta.json")
download_file(url, meta_path)
with open(meta_path) as f:
meta = json.load(f)
raw_date = meta["latest"].strip()
logger.info(f"Booking date is from {raw_date}.")
dt = datetime.strptime(raw_date, "%Y_%m_%d-%H_%M_%S")
return (env.dt - dt).days < max_days
return test_booking_data
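The freshness check above boils down to parsing the dump timestamp and comparing day counts. A standalone sketch with hypothetical dates (the real values come from `meta.json` and `env.dt`):

```python
from datetime import datetime

raw_date = "2024_01_05-03_00_00"  # hypothetical meta.json "latest" value
dt = datetime.strptime(raw_date, "%Y_%m_%d-%H_%M_%S")
gen_time = datetime(2024, 1, 10)  # hypothetical generation start time (env.dt)
# The booking data passes the test while it is less than max_days old.
fresh = (gen_time - dt).days < 7
```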


@ -0,0 +1,185 @@
import datetime
import json
import logging
import os
import re
from collections import defaultdict
from typing import AnyStr
from typing import Dict
from typing import List
from maps_generator.generator.env import WORLDS_NAMES
from maps_generator.generator.exceptions import ParseError
logger = logging.getLogger("maps_generator")
# Parse entries, written by ./generator/statistics.cpp PrintTypeStats.
RE_STAT = re.compile(
r"([\w:-]+): "
r"size = +\d+; "
r"features = +(\d+); "
r"length = +([0-9.e+-]+) m; "
r"area = +([0-9.e+-]+) m²; "
r"w\/names = +(\d+)"
)
RE_TIME_DELTA = re.compile(
r"^(?:(?P<days>-?\d+) (days?, )?)?"
r"((?:(?P<hours>-?\d+):)(?=\d+:\d+))?"
r"(?:(?P<minutes>-?\d+):)?"
r"(?P<seconds>-?\d+)"
r"(?:\.(?P<microseconds>\d{1,6})\d{0,6})?$"
)
RE_FINISH_STAGE = re.compile(r"(.*)Stage (.+): finished in (.+)$")
def read_stat(f):
stats = []
for line in f:
m = RE_STAT.match(line)
# Skip explanation header strings.
if m is None:
continue
stats.append(
{
"name": m.group(1),
"cnt": int(m.group(2)),
"len": float(m.group(3)),
"area": float(m.group(4)),
"names": int(m.group(5)),
}
)
return stats
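A standalone check of the `RE_STAT` pattern above against one hypothetical PrintTypeStats line (the `\/` escape is simplified to a plain `/`, which `re` treats identically):

```python
import re

RE_STAT = re.compile(  # same pattern as above
    r"([\w:-]+): "
    r"size = +\d+; "
    r"features = +(\d+); "
    r"length = +([0-9.e+-]+) m; "
    r"area = +([0-9.e+-]+) m²; "
    r"w/names = +(\d+)"
)
# Hypothetical statistics line in the PrintTypeStats format.
line = "highway-residential: size = 1024; features = 42; length = 3.5e+03 m; area = 0 m²; w/names = 7"
m = RE_STAT.match(line)
stat = {
    "name": m.group(1),
    "cnt": int(m.group(2)),
    "len": float(m.group(3)),
    "area": float(m.group(4)),
    "names": int(m.group(5)),
}
```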
def read_config(f):
config = []
for line in f:
l = line.strip()
if l.startswith("#") or not l:
continue
columns = [c.strip() for c in l.split(";", 2)]
columns[0] = re.compile(columns[0])
columns[1] = columns[1].lower()
config.append(columns)
return config
def process_stat(config, stats):
result = {}
for param in config:
res = 0
for t in stats:
if param[0].match(t["name"]):
if param[1] == "len":
res += t["len"]
elif param[1] == "area":
res += t["area"]
elif param[1] == "cnt_names":
res += t["names"]
else:
res += t["cnt"]
result[str(param[0]) + param[1]] = res
return result
def format_res(res, t):
unit = None
if t == "len":
unit = "m"
elif t == "area":
unit = ""
elif t == "cnt" or t == "cnt_names":
unit = "pc"
else:
raise ParseError(f"Unknown type {t}.")
return res, unit
def make_stats(config_path, stats_path):
with open(config_path) as f:
config = read_config(f)
with open(stats_path) as f:
stats = process_stat(config, read_stat(f))
lines = []
for param in config:
k = str(param[0]) + param[1]
st = format_res(stats[k], param[1])
lines.append({"type": param[2], "quantity": st[0], "unit": st[1]})
return lines
def parse_time(time_str):
parts = RE_TIME_DELTA.match(time_str)
if not parts:
return
parts = parts.groupdict()
time_params = {}
for name, param in parts.items():
if param:
time_params[name] = int(param)
return datetime.timedelta(**time_params)
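The pattern accepts the default `str(datetime.timedelta)` rendering that the finish-stage log lines use. A self-contained sketch of the same parsing logic:

```python
import datetime
import re

RE_TIME_DELTA = re.compile(  # same pattern as above
    r"^(?:(?P<days>-?\d+) (days?, )?)?"
    r"((?:(?P<hours>-?\d+):)(?=\d+:\d+))?"
    r"(?:(?P<minutes>-?\d+):)?"
    r"(?P<seconds>-?\d+)"
    r"(?:\.(?P<microseconds>\d{1,6})\d{0,6})?$"
)

def parse_time(time_str):
    # Convert "H:MM:SS" / "N days, H:MM:SS" strings back into a timedelta.
    parts = RE_TIME_DELTA.match(time_str)
    if not parts:
        return None
    params = {k: int(v) for k, v in parts.groupdict().items() if v}
    return datetime.timedelta(**params)
```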
def get_stages_info(log_path, ignored_stages=frozenset()):
result = defaultdict(lambda: defaultdict(dict))
for file in os.listdir(log_path):
path = os.path.join(log_path, file)
with open(path) as f:
for line in f:
m = RE_FINISH_STAGE.match(line)
if not m:
continue
stage_name = m.group(2)
dt = parse_time(m.group(3))
if file.startswith("stage_") and stage_name not in ignored_stages:
result["stages"][stage_name] = dt
else:
country = file.split(".")[0]
result["countries"][country][stage_name] = dt
return result
def read_types(path: AnyStr) -> Dict[AnyStr, Dict]:
""""
Reads and summarizes statistics for all countries, excluding World and
WorldCoast.
"""
with open(path) as f:
json_data = json.load(f)
all_types = {}
countries = json_data["countries"]
for country, json_value in countries.items():
if country in WORLDS_NAMES:
continue
try:
json_types = json_value["types"]
except KeyError:
logger.exception(f"Cannot parse {json_value}")
continue
for t in json_types:
curr = all_types.get(t["type"], {})
curr["quantity"] = curr.get("quantity", 0.0) + t["quantity"]
curr["unit"] = t["unit"]
all_types[t["type"]] = curr
return all_types
def diff(new: Dict[AnyStr, Dict], old: Dict[AnyStr, Dict]) -> List:
assert len(new) == len(old)
lines = []
for key in new:
o = old[key]["quantity"]
n = new[key]["quantity"]
rel = 0
if o != 0.0:
rel = int(((n - o) / o) * 100)
else:
if n != 0.0:
rel = 100
lines.append((key, o, n, rel, n - o, new[key]["unit"],))
return lines
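The relative-change rule in `diff()` above, extracted into a standalone helper (the name `rel_change` is hypothetical):

```python
def rel_change(old, new):
    # Percent change; 0 -> 0 counts as 0%, 0 -> anything as 100%, as in diff() above.
    if old != 0.0:
        return int(((new - old) / old) * 100)
    return 100 if new != 0.0 else 0
```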


@ -0,0 +1,53 @@
import os
from typing import AnyStr
from typing import Optional
def with_stat_ext(country: AnyStr):
return f"{country}.status"
def without_stat_ext(status: AnyStr):
return status.replace(".status", "")
class Status:
"""Status is used for recovering and continuation maps generation."""
def __init__(
self, stat_path: Optional[AnyStr] = None, stat_next: Optional[AnyStr] = None
):
self.stat_path = stat_path
self.stat_next = stat_next
self.stat_saved = None
self.find = False
def init(self, stat_path: AnyStr, stat_next: AnyStr):
self.stat_path = stat_path
self.stat_next = stat_next
self.stat_saved = self.status()
if not self.find:
self.find = not self.stat_saved or not self.need_skip()
def need_skip(self) -> bool:
if self.find:
return False
return self.stat_saved and self.stat_next != self.stat_saved
def update_status(self):
with open(self.stat_path, "w") as status:
status.write(self.stat_next)
def finish(self):
with open(self.stat_path, "w") as status:
status.write("finish")
def is_finished(self):
return self.status() == "finish"
def status(self):
try:
with open(self.stat_path) as status:
return status.read()
except IOError:
return None
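The resume semantics implemented by `Status` can be modelled as a small function (a simplified sketch of the skip/run decision, not the class itself):

```python
def resume_plan(stages, saved):
    # Stages before the saved stage name are skipped; from the saved stage
    # onward everything runs, mirroring Status.init()/need_skip().
    found = not saved
    plan = []
    for name in stages:
        if not found and name != saved:
            continue
        found = True
        plan.append(name)
    return plan
```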


@ -0,0 +1,453 @@
"""
This file contains basic api for generator_tool and osm tools to generate maps.
"""
import functools
import json
import logging
import os
import shutil
import subprocess
from typing import AnyStr
from maps_generator.generator import settings
from maps_generator.generator.env import Env
from maps_generator.generator.env import PathProvider
from maps_generator.generator.env import WORLDS_NAMES
from maps_generator.generator.env import WORLD_NAME
from maps_generator.generator.env import get_all_countries_list
from maps_generator.generator.exceptions import ValidationError
from maps_generator.generator.exceptions import wait_and_raise_if_fail
from maps_generator.generator.gen_tool import run_gen_tool
from maps_generator.generator.osmtools import osmconvert
from maps_generator.generator.osmtools import osmfilter
from maps_generator.generator.osmtools import osmupdate
from maps_generator.generator.statistics import make_stats
from maps_generator.utils.file import download_files
from maps_generator.utils.file import is_verified
from maps_generator.utils.file import make_symlink
from maps_generator.utils.md5 import md5_ext
from maps_generator.utils.md5 import write_md5sum
logger = logging.getLogger("maps_generator")
def multithread_run_if_one_country(func):
@functools.wraps(func)
def wrap(env, country, **kwargs):
if len(env.countries) == 1:
kwargs.update({"threads_count": settings.THREADS_COUNT})
# Otherwise the index stage of Taiwan_* mwms keeps running after all other mwms have finished:
elif country == 'Taiwan_North':
kwargs.update({"threads_count": 6})
elif country == 'Taiwan_South':
kwargs.update({"threads_count": 2})
func(env, country, **kwargs)
return wrap
def convert_planet(
tool: AnyStr,
in_planet: AnyStr,
out_planet: AnyStr,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
):
osmconvert(tool, in_planet, out_planet, output=output, error=error)
write_md5sum(out_planet, md5_ext(out_planet))
def step_download_and_convert_planet(env: Env, force_download: bool, **kwargs):
# Do not copy, convert, or checksum a local .o5m planet dump; just symlink it instead.
src = settings.PLANET_URL
if src.startswith("file://") and src.endswith(".o5m"):
os.symlink(src[7:], env.paths.planet_o5m)
return
if force_download or not is_verified(env.paths.planet_osm_pbf):
download_files(
{
settings.PLANET_URL: env.paths.planet_osm_pbf,
settings.PLANET_MD5_URL: md5_ext(env.paths.planet_osm_pbf),
},
env.force_download_files,
)
if not is_verified(env.paths.planet_osm_pbf):
raise ValidationError(f"Wrong md5 sum for {env.paths.planet_osm_pbf}.")
convert_planet(
env[settings.OSM_TOOL_CONVERT],
env.paths.planet_osm_pbf,
env.paths.planet_o5m,
output=env.get_subprocess_out(),
error=env.get_subprocess_out(),
)
os.remove(env.paths.planet_osm_pbf)
os.remove(md5_ext(env.paths.planet_osm_pbf))
def step_update_planet(env: Env, **kwargs):
tmp = f"{env.paths.planet_o5m}.tmp"
osmupdate(
env[settings.OSM_TOOL_UPDATE],
env.paths.planet_o5m,
tmp,
output=env.get_subprocess_out(),
error=env.get_subprocess_out(),
**kwargs,
)
os.remove(env.paths.planet_o5m)
os.rename(tmp, env.paths.planet_o5m)
write_md5sum(env.paths.planet_o5m, md5_ext(env.paths.planet_o5m))
def step_preprocess(env: Env, **kwargs):
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(),
err=env.get_subprocess_out(),
data_path=env.paths.data_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
osm_file_type="o5m",
osm_file_name=env.paths.planet_o5m,
node_storage=env.node_storage,
user_resource_path=env.paths.user_resource_path,
preprocess=True,
**kwargs,
)
def step_features(env: Env, **kwargs):
if any(x not in WORLDS_NAMES for x in env.countries):
kwargs.update({"generate_packed_borders": True})
if any(x == WORLD_NAME for x in env.countries):
kwargs.update({"generate_world": True})
if len(env.countries) == len(get_all_countries_list(PathProvider.borders_path())):
kwargs.update({"have_borders_for_whole_world": True})
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(),
err=env.get_subprocess_out(),
data_path=env.paths.data_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
osm_file_type="o5m",
osm_file_name=env.paths.planet_o5m,
node_storage=env.node_storage,
user_resource_path=env.paths.user_resource_path,
cities_boundaries_data=env.paths.cities_boundaries_path,
generate_features=True,
threads_count=settings.THREADS_COUNT_FEATURES_STAGE,
**kwargs,
)
def run_gen_tool_with_recovery_country(env: Env, *args, **kwargs):
if "data_path" not in kwargs or "output" not in kwargs:
logger.warning("The call run_gen_tool() will be without recovery.")
run_gen_tool(*args, **kwargs)
prev_data_path = kwargs["data_path"]
mwm = f"{kwargs['output']}.mwm"
osm2ft = f"{mwm}.osm2ft"
kwargs["data_path"] = env.paths.draft_path
make_symlink(
os.path.join(prev_data_path, osm2ft), os.path.join(env.paths.draft_path, osm2ft)
)
shutil.copy(
os.path.join(prev_data_path, mwm), os.path.join(env.paths.draft_path, mwm)
)
run_gen_tool(*args, **kwargs)
shutil.move(
os.path.join(env.paths.draft_path, mwm), os.path.join(prev_data_path, mwm)
)
kwargs["data_path"] = prev_data_path
@multithread_run_if_one_country
def _generate_common_index(env: Env, country: AnyStr, **kwargs):
run_gen_tool(
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
node_storage=env.node_storage,
planet_version=env.planet_version,
generate_geometry=True,
generate_index=True,
output=country,
**kwargs,
)
def step_index_world(env: Env, country: AnyStr, **kwargs):
_generate_common_index(
env,
country,
generate_search_index=True,
cities_boundaries_data=env.paths.cities_boundaries_path,
generate_cities_boundaries=True,
**kwargs,
)
def step_cities_ids_world(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
user_resource_path=env.paths.user_resource_path,
output=country,
generate_cities_ids=True,
**kwargs,
)
def filter_roads(
name_executable,
in_file,
out_file,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
):
osmfilter(
name_executable,
in_file,
out_file,
output=output,
error=error,
keep="",
keep_ways="highway=motorway =trunk =primary =secondary =tertiary",
)
def make_world_road_graph(
name_executable,
path_roads_file,
path_resources,
path_res_file,
logger,
output=subprocess.DEVNULL,
error=subprocess.DEVNULL,
):
world_roads_builder_tool_cmd = [
name_executable,
f"--path_roads_file={path_roads_file}",
f"--path_resources={path_resources}",
f"--path_res_file={path_res_file}",
]
logger.info(f"Starting {' '.join(world_roads_builder_tool_cmd)}")
world_roads_builder_tool = subprocess.Popen(
world_roads_builder_tool_cmd, stdout=output, stderr=error, env=os.environ
)
wait_and_raise_if_fail(world_roads_builder_tool)
def step_prepare_routing_world(env: Env, country: AnyStr, logger, **kwargs):
filter_roads(
env[settings.OSM_TOOL_FILTER],
env.paths.planet_o5m,
env.paths.world_roads_o5m,
env.get_subprocess_out(country),
env.get_subprocess_out(country),
)
make_world_road_graph(
env.world_roads_builder_tool,
env.paths.world_roads_o5m,
env.paths.user_resource_path,
env.paths.world_roads_path,
logger,
env.get_subprocess_out(country),
env.get_subprocess_out(country)
)
def step_routing_world(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
user_resource_path=env.paths.user_resource_path,
output=country,
world_roads_path=env.paths.world_roads_path,
**kwargs,
)
def step_index(env: Env, country: AnyStr, **kwargs):
_generate_common_index(env, country, generate_search_index=True, **kwargs)
def step_coastline_index(env: Env, country: AnyStr, **kwargs):
_generate_common_index(env, country, **kwargs)
def step_ugc(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
ugc_data=env.paths.ugc_path,
output=country,
**kwargs,
)
def step_popularity(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
user_resource_path=env.paths.user_resource_path,
generate_popular_places=True,
output=country,
**kwargs,
)
def step_popularity_world(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
user_resource_path=env.paths.user_resource_path,
wikipedia_pages=env.paths.descriptions_path,
idToWikidata=env.paths.id_to_wikidata_path,
generate_popular_places=True,
output=country,
**kwargs,
)
def step_srtm(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
srtm_path=env.paths.srtm_path(),
output=country,
**kwargs,
)
def step_isolines_info(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
generate_isolines_info=True,
isolines_path=PathProvider.isolines_path(),
output=country,
**kwargs,
)
def step_description(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
user_resource_path=env.paths.user_resource_path,
wikipedia_pages=env.paths.descriptions_path,
idToWikidata=env.paths.id_to_wikidata_path,
output=country,
**kwargs,
)
def step_routing(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
cities_boundaries_data=env.paths.cities_boundaries_path,
generate_maxspeed=True,
make_city_roads=True,
make_cross_mwm=True,
generate_cameras=True,
make_routing_index=True,
generate_traffic_keys=False,
output=country,
**kwargs,
)
def step_routing_transit(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
transit_path=env.paths.transit_path,
transit_path_experimental=env.paths.transit_path_experimental,
make_transit_cross_mwm=True,
make_transit_cross_mwm_experimental=bool(env.paths.transit_path_experimental),
output=country,
**kwargs,
)
def step_statistics(env: Env, country: AnyStr, **kwargs):
run_gen_tool_with_recovery_country(
env,
env.gen_tool,
out=env.get_subprocess_out(country),
err=env.get_subprocess_out(country),
data_path=env.paths.mwm_path,
intermediate_data_path=env.paths.intermediate_data_path,
cache_path=env.paths.cache_path,
user_resource_path=env.paths.user_resource_path,
stats_types=True,
output=country,
**kwargs,
)
with open(os.path.join(env.paths.stats_path, f"{country}.json"), "w") as f:
json.dump(
make_stats(
settings.STATS_TYPES_CONFIG,
os.path.join(env.paths.intermediate_data_path, f"{country}.stats"),
),
f,
)
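Across all these step functions, the keyword arguments appear to become `--key=value` flags on the `generator_tool` command line, judging from the invocations recorded in the build logs (e.g. `--generate_maxspeed=true --output=Czech_Jihovychod_Jihomoravsky kraj`). A minimal sketch of that mapping; the helper name `kwargs_to_cli_flags` is hypothetical, as the real conversion happens inside the `run_gen_tool` machinery:

```python
# Hypothetical sketch only: mimics how kwargs seem to map to generator_tool
# CLI flags, based on the logged command lines (booleans become true/false).
def kwargs_to_cli_flags(**kwargs):
    flags = []
    for key, value in kwargs.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        flags.append(f"--{key}={value}")
    return flags

flags = kwargs_to_cli_flags(
    generate_maxspeed=True,
    generate_traffic_keys=False,
    output="Czech_Jihovychod_Jihomoravsky kraj",
)
# flags == ['--generate_maxspeed=true', '--generate_traffic_keys=false',
#           '--output=Czech_Jihovychod_Jihomoravsky kraj']
```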

View file

@ -0,0 +1,55 @@
import logging
from typing import AnyStr
from typing import Iterable
from typing import Optional
from maps_generator.generator import stages_declaration as sd
from maps_generator.generator.env import Env
from maps_generator.generator.generation import Generation
from .generator.stages import Stage
logger = logging.getLogger("maps_generator")
def run_generation(
env: Env,
stages: Iterable[Stage],
from_stage: Optional[AnyStr] = None,
build_lock: bool = True,
):
generation = Generation(env, build_lock)
for s in stages:
generation.add_stage(s)
generation.run(from_stage)
def generate_maps(env: Env, from_stage: Optional[AnyStr] = None):
""""Runs maps generation."""
stages = (
sd.StageDownloadAndConvertPlanet(),
sd.StageUpdatePlanet(),
sd.StageCoastline(),
sd.StagePreprocess(),
sd.StageFeatures(),
sd.StageDownloadDescriptions(),
sd.StageMwm(),
sd.StageCountriesTxt(),
sd.StageLocalAds(),
sd.StageStatistics(),
sd.StageCleanup(),
)
run_generation(env, stages, from_stage)
def generate_coasts(env: Env, from_stage: Optional[AnyStr] = None):
"""Runs coasts generation."""
stages = (
sd.StageDownloadAndConvertPlanet(),
sd.StageUpdatePlanet(),
sd.StageCoastline(use_old_if_fail=False),
sd.StageCleanup(),
)
run_generation(env, stages, from_stage)
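`run_generation` feeds an ordered stage list into a `Generation` and optionally resumes from a named stage. A toy stand-in illustrating the assumed `from_stage` semantics (skip everything before the named stage); the real class lives in `maps_generator.generator.generation`:

```python
# Toy stand-in for Generation, to illustrate from_stage resumption
# (an assumption based on the API shape, not the real implementation).
class ToyGeneration:
    def __init__(self):
        self.stages = []
        self.executed = []

    def add_stage(self, stage):
        self.stages.append(stage)

    def run(self, from_stage=None):
        # Resume at the named stage, or start from the beginning.
        start = 0 if from_stage is None else self.stages.index(from_stage)
        for stage in self.stages[start:]:
            self.executed.append(stage)

gen = ToyGeneration()
for stage in ("DownloadAndConvertPlanet", "Coastline", "Mwm", "Cleanup"):
    gen.add_stage(stage)
gen.run(from_stage="Mwm")
# gen.executed == ["Mwm", "Cleanup"]
```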

View file

@ -0,0 +1,8 @@
omim-data-all
omim-data-files
omim-descriptions
omim-post_generation
filelock==3.0.10
beautifulsoup4==4.9.1
requests>=2.31.0
requests_file==1.5.1

View file

@ -0,0 +1,6 @@
-r ../post_generation/requirements_dev.txt
-r ../descriptions/requirements_dev.txt
filelock==3.0.10
beautifulsoup4==4.9.1
requests>=2.31.0
requests_file==1.5.1

View file

@ -0,0 +1,37 @@
#!/usr/bin/env python3
import os
import sys
import setuptools
module_dir = os.path.abspath(os.path.dirname(__file__))
sys.path.insert(0, os.path.join(module_dir, "..", "..", ".."))
from pyhelpers.setup import chdir
from pyhelpers.setup import get_version
from pyhelpers.setup import get_requirements
with chdir(os.path.abspath(os.path.dirname(__file__))):
setuptools.setup(
name="omim-maps_generator",
version=str(get_version()),
author="CoMaps",
author_email="info@comaps.app",
description="This package contains tools for maps generation.",
url="https://codeberg.org/comaps",
package_dir={"maps_generator": ""},
package_data={"": ["var/**/*"]},
packages=[
"maps_generator",
"maps_generator.generator",
"maps_generator.utils",
"maps_generator.checks"
],
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: Apache Software License",
],
python_requires=">=3.6",
install_requires=get_requirements(),
)

View file

@ -0,0 +1,131 @@
import datetime
import logging
import os
import re
import tempfile
import unittest
from collections import Counter
from maps_generator.checks.logs import logs_reader
class TestLogsReader(unittest.TestCase):
def setUp(self):
self.dir = tempfile.TemporaryDirectory()
with open(
os.path.join(self.dir.name, "Czech_Jihovychod_Jihomoravsky kraj.log"), "w"
) as file:
file.write(LOG_STRING)
logs = list(logs_reader.LogsReader(self.dir.name))
self.assertEqual(len(logs), 1)
self.log = logs[0]
def tearDown(self):
self.dir.cleanup()
def test_read_logs(self):
self.assertTrue(self.log.name.startswith("Czech_Jihovychod_Jihomoravsky kraj"))
self.assertTrue(self.log.is_mwm_log)
self.assertFalse(self.log.is_stage_log)
self.assertEqual(len(self.log.lines), 46)
def test_split_into_stages(self):
st = logs_reader.split_into_stages(self.log)
self.assertEqual(len(st), 4)
names_counter = Counter(s.name for s in st)
self.assertEqual(
names_counter,
Counter({"Routing": 1, "RoutingTransit": 1, "MwmStatistics": 2}),
)
def test_split_and_normalize_logs(self):
st = logs_reader.normalize_logs(logs_reader.split_into_stages(self.log))
self.assertEqual(len(st), 3)
m = {s.name: s for s in st}
self.assertEqual(
m["MwmStatistics"].duration, datetime.timedelta(seconds=3.628742)
)
def test_count_levels(self):
st = logs_reader.normalize_logs(logs_reader.split_into_stages(self.log))
self.assertEqual(len(st), 3)
m = {s.name: s for s in st}
c = logs_reader.count_levels(m["Routing"])
self.assertEqual(c, Counter({logging.INFO: 22, logging.ERROR: 1}))
c = logs_reader.count_levels(self.log.lines)
self.assertEqual(c, Counter({logging.INFO: 45, logging.ERROR: 1}))
def test_find_and_parse(self):
st = logs_reader.normalize_logs(logs_reader.split_into_stages(self.log))
self.assertEqual(len(st), 3)
m = {s.name: s for s in st}
pattern_str = (
r".*Leaps finished, elapsed: [0-9.]+ seconds, routes found: "
r"(?P<routes_found>\d+) , not found: (?P<routes_not_found>\d+)$"
)
for found in (
logs_reader.find_and_parse(m["Routing"], pattern_str),
logs_reader.find_and_parse(self.log.lines, re.compile(pattern_str)),
):
self.assertEqual(len(found), 1)
line = found[0]
self.assertEqual(
line[0], {"routes_found": "996363", "routes_not_found": "126519"}
)
if __name__ == "main":
unittest.main()
LOG_STRING = """
[2020-05-24 04:19:37,032] INFO stages Stage Routing: start ...
[2020-05-24 04:19:37,137] INFO gen_tool Run generator tool [generator_tool version 1590177464 f52c6496c4d90440f2e0d8088acdb3350dcf7c69]: /home/Projects/build-omim-Desktop_Qt_5_10_1_GCC_64bit-Release/generator_tool --threads_count=1 --data_path=/home/maps_build/2020_05_23__16_58_17/draft --intermediate_data_path=/home/maps_build/2020_05_23__16_58_17/intermediate_data --user_resource_path=/home/Projects/omim/data --cities_boundaries_data=/home/maps_build/2020_05_23__16_58_17/intermediate_data/cities_boundaries.bin --generate_maxspeed=true --make_city_roads=true --make_cross_mwm=true --generate_cameras=true --make_routing_index=true --generate_traffic_keys=true --output=Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 3.29e-06 Loaded countries list for version: 200402
LOG TID(1) INFO 7.945e-05 generator/camera_info_collector.cpp:339 BuildCamerasInfo() Generating cameras info for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 0.529856 generator/routing_index_generator.cpp:546 BuildRoutingIndex() Building routing index for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 2.11074 generator/routing_index_generator.cpp:563 BuildRoutingIndex() Routing section created: 639872 bytes, 163251 roads, 193213 joints, 429334 points
LOG TID(1) INFO 2.90872 generator/restriction_generator.cpp:117 SerializeRestrictions() Routing restriction info: RestrictionHeader: { No => 430, Only => 284, NoUTurn => 123, OnlyUTurn => 0 }
LOG TID(1) INFO 3.00342 generator/road_access_generator.cpp:799 BuildRoadAccessInfo() Generating road access info for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 3.77435 generator_tool/generator_tool.cpp:621 operator()() Generating cities boundaries roads for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 3.85993 generator/city_roads_generator.cpp:51 LoadCitiesBoundariesGeometry() Read: 14225 boundaries from: /home/maps_build/2020_05_23__16_58_17/intermediate_data/routing_city_boundaries.bin
LOG TID(1) INFO 6.82577 routing/city_roads_serialization.hpp:78 Serialize() Serialized 81697 road feature ids in cities. Size: 77872 bytes.
LOG TID(1) INFO 6.82611 generator_tool/generator_tool.cpp:621 operator()() Generating maxspeeds section for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 6.82616 generator/maxspeeds_builder.cpp:186 BuildMaxspeedsSection() BuildMaxspeedsSection( /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm , /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm.osm2ft , /home/maps_build/2020_05_23__16_58_17/intermediate_data/maxspeeds.csv )
LOG TID(1) INFO 7.58621 routing/maxspeeds_serialization.hpp:144 Serialize() Serialized 11413 forward maxspeeds and 302 bidirectional maxspeeds. Section size: 17492 bytes.
LOG TID(1) INFO 7.58623 generator/maxspeeds_builder.cpp:172 SerializeMaxspeeds() SerializeMaxspeeds( /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm , ...) serialized: 11715 maxspeed tags.
LOG TID(1) INFO 7.64526 generator/routing_index_generator.cpp:596 BuildRoutingCrossMwmSection() Building cross mwm section for Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 8.43521 generator/routing_index_generator.cpp:393 CalcCrossMwmConnectors() Transitions finished, transitions: 1246 , elapsed: 0.789908 seconds
LOG TID(1) INFO 8.48956 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Pedestrian model. Number of enters: 1233 Number of exits: 1233
LOG TID(1) INFO 8.48964 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Bicycle model. Number of enters: 1231 Number of exits: 1230
LOG TID(1) INFO 8.48964 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Car model. Number of enters: 1089 Number of exits: 1089
LOG TID(1) INFO 8.48965 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Transit model. Number of enters: 0 Number of exits: 0
LOG TID(1) INFO 4241.68 generator/routing_index_generator.cpp:537 FillWeights() Leaps finished, elapsed: 4233.19 seconds, routes found: 996363 , not found: 126519
LOG TID(1) INFO 4241.8 generator/routing_index_generator.cpp:588 SerializeCrossMwm() Cross mwm section generated, size: 1784214 bytes
LOG TID(1) ERROR 4243.2 generator/routing_index_generator.cpp:588 SerializeCrossMwm() Fake error.
[2020-05-24 05:30:19,319] INFO stages Stage Routing: finished in 1:10:42.287364
[2020-05-24 05:30:19,319] INFO stages Stage RoutingTransit: start ...
[2020-05-24 05:30:19,485] INFO gen_tool Run generator tool [generator_tool version 1590177464 f52c6496c4d90440f2e0d8088acdb3350dcf7c69]: /home/Projects/build-omim-Desktop_Qt_5_10_1_GCC_64bit-Release/generator_tool --threads_count=1 --data_path=/home/maps_build/2020_05_23__16_58_17/draft --intermediate_data_path=/home/maps_build/2020_05_23__16_58_17/intermediate_data --user_resource_path=/home/Projects/omim/data --transit_path=/home/maps_build/2020_05_23__16_58_17/intermediate_data --make_transit_cross_mwm=true --output=Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 3.107e-06 Loaded countries list for version: 200402
LOG TID(1) INFO 6.0315e-05 generator/transit_generator.cpp:205 BuildTransit() Building transit section for Czech_Jihovychod_Jihomoravsky kraj mwmDir: /home/maps_build/2020_05_23__16_58_17/draft/
LOG TID(1) INFO 5.40151 generator/routing_index_generator.cpp:617 BuildTransitCrossMwmSection() Building transit cross mwm section for Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 5.47317 generator/routing_index_generator.cpp:320 CalcCrossMwmTransitions() Transit cross mwm section is not generated because no transit section in mwm: /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
LOG TID(1) INFO 5.4732 generator/routing_index_generator.cpp:393 CalcCrossMwmConnectors() Transitions finished, transitions: 0 , elapsed: 0.0716537 seconds
LOG TID(1) INFO 5.47321 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Pedestrian model. Number of enters: 0 Number of exits: 0
LOG TID(1) INFO 5.47321 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Bicycle model. Number of enters: 0 Number of exits: 0
LOG TID(1) INFO 5.47322 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Car model. Number of enters: 0 Number of exits: 0
LOG TID(1) INFO 5.47322 generator/routing_index_generator.cpp:411 CalcCrossMwmConnectors() Transit model. Number of enters: 0 Number of exits: 0
LOG TID(1) INFO 5.47325 generator/routing_index_generator.cpp:588 SerializeCrossMwm() Cross mwm section generated, size: 31 bytes
[2020-05-24 05:30:25,144] INFO stages Stage RoutingTransit: finished in 0:00:05.824967
[2020-05-24 05:30:25,144] INFO stages Stage MwmStatistics: start ...
[2020-05-24 05:30:25,212] INFO gen_tool Run generator tool [generator_tool version 1590177464 f52c6496c4d90440f2e0d8088acdb3350dcf7c69]: /home/Projects/build-omim-Desktop_Qt_5_10_1_GCC_64bit-Release/generator_tool --threads_count=1 --data_path=/home/maps_build/2020_05_23__16_58_17/draft --intermediate_data_path=/home/maps_build/2020_05_23__16_58_17/intermediate_data --user_resource_path=/home/Projects/omim/data --stats_types=true --output=Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 1.5806e-05 generator_tool/generator_tool.cpp:621 operator()() Calculating type statistics for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
[2020-05-24 05:30:28,773] INFO stages Stage MwmStatistics: finished in 0:00:03.628742
[2020-05-24 06:30:25,144] INFO stages Stage MwmStatistics: start ...
[2020-05-24 06:30:25,212] INFO gen_tool Run generator tool [generator_tool version 1590177464 f52c6496c4d90440f2e0d8088acdb3350dcf7c69]: /home/Projects/build-omim-Desktop_Qt_5_10_1_GCC_64bit-Release/generator_tool --threads_count=1 --data_path=/home/maps_build/2020_05_23__16_58_17/draft --intermediate_data_path=/home/maps_build/2020_05_23__16_58_17/intermediate_data --user_resource_path=/home/Projects/omim/data --stats_types=true --output=Czech_Jihovychod_Jihomoravsky kraj
LOG TID(1) INFO 1.5806e-05 generator_tool/generator_tool.cpp:621 operator()() Calculating type statistics for /home/maps_build/2020_05_23__16_58_17/draft/Czech_Jihovychod_Jihomoravsky kraj.mwm
[2020-05-24 06:30:28,773] INFO stages Stage MwmStatistics: finished in 0:00:01.628742
"""

View file

@ -0,0 +1,44 @@
import argparse
from multiprocessing.pool import ThreadPool
from typing import Tuple
from maps_generator.checks.logs import logs_reader
def get_args():
parser = argparse.ArgumentParser(
description="This script generates file with countries that are "
"ordered by time needed to generate them."
)
parser.add_argument(
"--output", type=str, required=True, help="Path to output file.",
)
parser.add_argument(
"--logs", type=str, required=True, help="Path to logs directory.",
)
return parser.parse_args()
def process_log(log: logs_reader.Log) -> Tuple[str, float]:
stage_logs = logs_reader.split_into_stages(log)
stage_logs = logs_reader.normalize_logs(stage_logs)
d = sum(s.duration.total_seconds() for s in stage_logs if s.duration is not None)
return log.name, d
def main():
args = get_args()
with ThreadPool() as pool:
order = pool.map(
process_log,
(log for log in logs_reader.LogsReader(args.logs) if log.is_mwm_log),
)
order.sort(key=lambda v: v[1], reverse=True)
with open(args.output, "w") as out:
out.write("# Mwm name\tGeneration time\n")
out.writelines("{}\t{}\n".format(*line) for line in order)
if __name__ == "__main__":
main()

View file

@ -0,0 +1,19 @@
import re
from datetime import timedelta
DURATION_PATTERN = re.compile(
r"((?P<days>[-\d]+) day[s]*, )?(?P<hours>\d+):(?P<minutes>\d+):(?P<seconds>\d[\.\d+]*)"
)
def unique(s):
seen = set()
seen_add = seen.add
return [x for x in s if not (x in seen or seen_add(x))]
def parse_timedelta(s):
m = DURATION_PATTERN.match(s)
d = m.groupdict()
return timedelta(**{k: float(d[k]) for k in d if d[k] is not None})
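`DURATION_PATTERN` accepts both the plain `H:MM:SS.ffffff` stamps found in the stage logs (e.g. "finished in 1:10:42.287364") and the `N days, H:MM:SS` form `str(timedelta)` produces for longer runs. A quick self-contained check of both forms, reproducing the parser above:

```python
import re
from datetime import timedelta

# Same pattern and parser as above, reproduced to demonstrate the accepted forms.
DURATION_PATTERN = re.compile(
    r"((?P<days>[-\d]+) day[s]*, )?(?P<hours>\d+):(?P<minutes>\d+):(?P<seconds>\d[\.\d+]*)"
)

def parse_timedelta(s):
    d = DURATION_PATTERN.match(s).groupdict()
    return timedelta(**{k: float(v) for k, v in d.items() if v is not None})

assert parse_timedelta("1:10:42.287364") == timedelta(hours=1, minutes=10, seconds=42.287364)
assert parse_timedelta("2 days, 0:00:05") == timedelta(days=2, seconds=5)
```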

View file

@ -0,0 +1,201 @@
import errno
import functools
import glob
import logging
import os
import shutil
from functools import partial
from multiprocessing.pool import ThreadPool
from typing import AnyStr
from typing import Dict
from typing import List
from typing import Optional
from urllib.parse import unquote
from urllib.parse import urljoin
from urllib.parse import urlparse
from urllib.request import url2pathname
import requests
from bs4 import BeautifulSoup
from requests_file import FileAdapter
from maps_generator.utils.md5 import check_md5
from maps_generator.utils.md5 import md5_ext
logger = logging.getLogger("maps_generator")
def is_file_uri(url: AnyStr) -> bool:
return urlparse(url).scheme == "file"
def file_uri_to_path(url: AnyStr) -> AnyStr:
file_uri = urlparse(url)
file_path = file_uri.path
# URI is something like "file://~/..."
if file_uri.netloc == '~':
file_path = f'~{file_uri.path}'
return os.path.expanduser(file_path)
return file_path
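The helper handles both absolute `file:///…` URLs and the `file://~/…` special case, where the `~` lands in the URL's netloc and must be re-joined before `expanduser`. A self-contained replica for a quick behavioural check:

```python
import os
from urllib.parse import urlparse

# Replica of file_uri_to_path above, to show the two accepted forms.
def file_uri_to_path(url):
    file_uri = urlparse(url)
    if file_uri.netloc == "~":
        # URI like "file://~/planet.o5m": '~' parses as the netloc.
        return os.path.expanduser(f"~{file_uri.path}")
    return file_uri.path

assert file_uri_to_path("file:///home/planet/planet.o5m") == "/home/planet/planet.o5m"
assert file_uri_to_path("file://~/planet.o5m") == os.path.expanduser("~/planet.o5m")
```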
def is_executable(fpath: AnyStr) -> bool:
return fpath is not None and os.path.isfile(fpath) and os.access(fpath, os.X_OK)
@functools.lru_cache()
def find_executable(path: AnyStr, exe: Optional[AnyStr] = None) -> AnyStr:
if exe is None:
if is_executable(path):
return path
else:
raise FileNotFoundError(path)
find_pattern = f"{path}/**/{exe}"
for name in glob.iglob(find_pattern, recursive=True):
if is_executable(name):
return name
raise FileNotFoundError(f"{exe} not found in {path}")
def download_file(url: AnyStr, name: AnyStr, download_if_exists: bool = True):
logger.info(f"Trying to download {name} from {url}.")
if not download_if_exists and os.path.exists(name):
logger.info(f"File {name} already exists.")
return
if is_file_uri(url):
# url uses 'file://' scheme
shutil.copy2(file_uri_to_path(url), name)
logger.info(f"File {name} was copied from {url}.")
return
tmp_name = f"{name}__"
os.makedirs(os.path.dirname(tmp_name), exist_ok=True)
with requests.Session() as session:
session.mount("file://", FileAdapter())
with open(tmp_name, "wb") as handle:
response = session.get(url, stream=True)
file_length = None
try:
file_length = int(response.headers["Content-Length"])
except KeyError:
logger.warning(
f"There is no attribute Content-Length in headers [{url}]: {response.headers}"
)
current = 0
max_attempts = 32
attempts = max_attempts
while attempts:
for data in response.iter_content(chunk_size=4096):
current += len(data)
handle.write(data)
if file_length is None or file_length == current:
break
logger.warning(
f"Download interrupted. Resuming download from {url}: {current}/{file_length}."
)
headers = {"Range": f"bytes={current}-"}
response = session.get(url, headers=headers, stream=True)
attempts -= 1
assert (
attempts > 0
), f"Maximum failed resuming download attempts of {max_attempts} is exceeded."
shutil.move(tmp_name, name)
logger.info(f"File {name} was downloaded from {url}.")
def is_dir(url) -> bool:
return url.endswith("/")
def find_files(url) -> List[AnyStr]:
def files_list_file_scheme(path, results=None):
if results is None:
results = []
for p in os.listdir(path):
new_path = os.path.join(path, p)
if os.path.isdir(new_path):
files_list_file_scheme(new_path, results)
else:
results.append(new_path)
return results
def files_list_http_scheme(url, results=None):
if results is None:
results = []
page = requests.get(url).content
bs = BeautifulSoup(page, "html.parser")
links = bs.findAll("a", href=True)
for link in links:
href = link["href"]
if href == "./" or href == "../":
continue
new_url = urljoin(url, href)
if is_dir(new_url):
files_list_http_scheme(new_url, results)
else:
results.append(new_url)
return results
parse_result = urlparse(url)
if parse_result.scheme == "file":
return [
f.replace(parse_result.path, "")
for f in files_list_file_scheme(parse_result.path)
]
if parse_result.scheme == "http" or parse_result.scheme == "https":
return [f.replace(url, "") for f in files_list_http_scheme(url)]
assert False, parse_result
def normalize_url_to_path_dict(
url_to_path: Dict[AnyStr, AnyStr]
) -> Dict[AnyStr, AnyStr]:
for url in list(url_to_path.keys()):
if is_dir(url):
path = url_to_path[url]
del url_to_path[url]
for rel_path in find_files(url):
abs_url = urljoin(url, rel_path)
url_to_path[abs_url] = unquote(os.path.join(path, rel_path))
return url_to_path
def download_files(url_to_path: Dict[AnyStr, AnyStr], download_if_exists: bool = True):
with ThreadPool() as pool:
pool.starmap(
partial(download_file, download_if_exists=download_if_exists),
url_to_path.items(),
)
def is_exists_file_and_md5(name: AnyStr) -> bool:
return os.path.isfile(name) and os.path.isfile(md5_ext(name))
def is_verified(name: AnyStr) -> bool:
return is_exists_file_and_md5(name) and check_md5(name, md5_ext(name))
def make_symlink(target: AnyStr, link_name: AnyStr):
try:
os.symlink(target, link_name)
except OSError as e:
if e.errno == errno.EEXIST:
if os.path.islink(link_name):
link = os.readlink(link_name)
if os.path.abspath(target) != os.path.abspath(link):
raise e
else:
raise e
else:
raise e
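`make_symlink` is idempotent when the link already exists and points at the same target, but raises when it points elsewhere. A POSIX-only demonstration in a temporary directory, using an equivalently restructured replica of the function:

```python
import errno
import os
import tempfile

# Equivalent restructuring of make_symlink above (POSIX assumed):
# re-creating an identical link is a no-op, a conflicting link raises.
def make_symlink(target, link_name):
    try:
        os.symlink(target, link_name)
    except OSError as e:
        if e.errno != errno.EEXIST or not os.path.islink(link_name):
            raise
        if os.path.abspath(target) != os.path.abspath(os.readlink(link_name)):
            raise

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "maps_build")
    os.mkdir(target)
    link = os.path.join(d, "latest")
    make_symlink(target, link)
    make_symlink(target, link)  # same target: silently accepted
    try:
        make_symlink(os.path.join(d, "other"), link)
        conflict_raised = False
    except OSError:
        conflict_raised = True
```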

View file

@ -0,0 +1,34 @@
import logging
logger = logging.getLogger("maps_generator")
class DummyObject:
def __getattr__(self, name):
return lambda *args: None
def create_file_handler(
file,
level=logging.DEBUG,
formatter=None
):
if formatter is None and logger.hasHandlers():
formatter = logger.handlers[0].formatter
handler = logging.FileHandler(file)
handler.setLevel(level)
handler.setFormatter(formatter)
return handler
def create_file_logger(
file,
level=logging.DEBUG,
formatter=None
):
_logger = logging.getLogger(file)
_logger.setLevel(level)
handler = create_file_handler(file, level, formatter)
_logger.addHandler(handler)
return _logger, handler

View file

@ -0,0 +1,31 @@
import hashlib
def md5sum(name, block_size=4096):
d = hashlib.md5()
with open(name, mode="rb") as f:
buf = f.read(block_size)
while len(buf) > 0:
d.update(buf)
buf = f.read(block_size)
return d.hexdigest()
def write_md5sum(fname, name):
with open(name, mode="w") as f:
md5 = md5sum(fname)
f.write(md5)
def check_md5(fname, name):
h = md5sum(fname)
with open(name, "r") as f:
data = f.read()
assert len(data) != 0, f"The file {name} is empty"
if data.split()[0] == h:
return True
return False
def md5_ext(name):
return f"{name}.md5"

View file

@ -0,0 +1,9 @@
import os
import sys
def total_virtual_memory():
if sys.platform.startswith("linux"):
return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
else:
return 0

File diff suppressed because it is too large

View file

@ -0,0 +1,109 @@
[Developer]
# Path to the `organicmaps` source code repository:
OMIM_PATH: ~/OM/organicmaps
# A path with the generator_tool binary:
BUILD_PATH: ${Developer:OMIM_PATH}/../omim-build-release
[Main]
# A special small planet file will be downloaded if DEBUG is set to 1.
DEBUG: 0
# The main working directory. A subdirectory is created for each generator run
# which contains the planet and other downloads, temporary build files, logs and completed MWMs.
MAIN_OUT_PATH: ${Developer:OMIM_PATH}/../maps_build
# Path for storing caches for nodes, ways, relations.
# If it's not set then caches are stored inside the directory of the current build.
# CACHE_PATH: ${Main:MAIN_OUT_PATH}/cache
[Generator tool]
# Path to the data/ folder in the repository:
USER_RESOURCE_PATH: ${Developer:OMIM_PATH}/data
# Parallelism level for the features stage only. Set to 0 for auto detection.
THREADS_COUNT_FEATURES_STAGE: 0
# How to store all nodes with their coords.
# "map" (default) - fast, suitable to generate a few countries, but is not suitable for the whole planet
# "mem" - fastest, best for the whole planet generation, needs ~100GB memory (as of 2025)
# "raw" - read from a mmapped file, slow, but uses the least memory
NODE_STORAGE: map
[Osm tools]
# Path to osmctools binaries:
OSM_TOOLS_PATH: ${Developer:OMIM_PATH}/../osmctools
# If the binaries are found neither in the configured path nor system-wide,
# then the tools are built from the sources:
OSM_TOOLS_SRC_PATH: ${Developer:OMIM_PATH}/tools/osmctools
[Logging]
# maps_generator's general (python output only) log file path and name.
# More detailed logs that include output of the `generator_tool` binary
# are located in the `logs/` subdir of a particular build directory,
# e.g. `maps_build/2023_06_04__20_05_07/logs/`.
LOG_FILE_PATH: ${Main:MAIN_OUT_PATH}/generation.log
[External]
# Planet file location. It should be a dump of OSM data in osm.pbf format.
# By default it's an entire planet from "planet.openstreetmap.org".
# Or set it to a particular country/region extract from e.g. [Geofabrik](http://download.geofabrik.de/index.html).
# Note that an entire planet generation takes 40+ hours on a 256GB RAM server (and 1TB+ disk space).
# Stick to smaller extracts unless you have a machine this large.
# Here and further, it's possible to specify either a URL (to be downloaded automatically)
# or a local file path like file:///path/to/file.
# A sample URL to download the latest OSM dump for North Macedonia:
PLANET_URL: https://download.geofabrik.de/europe/macedonia-latest.osm.pbf
# Location of the md5 checksum of the planet file:
PLANET_MD5_URL: ${External:PLANET_URL}.md5
# A base url to the latest_coasts.geom and latest_coasts.rawgeom files.
# For example, if PLANET_COASTS_URL = https://somesite.com/download/
# then the https://somesite.com/download/latest_coasts.geom url will be used to download latest_coasts.geom and
# the https://somesite.com/download/latest_coasts.rawgeom url will be used to download latest_coasts.rawgeom.
# Comment to skip getting the coastlines files.
# PLANET_COASTS_URL:
# Should be 'true' for an entire planet build: it adds a special routing section to World.mwm
# that alerts about absent regions without which a route can't be built.
NEED_BUILD_WORLD_ROADS: false
# Subway file location, see docs/SUBWAY_GENERATION.md if you want to generate your own file.
# Comment to disable subway layer generation.
#SUBWAY_URL: https://cdn.organicmaps.app/subway.json
# Location of the EXPERIMENTAL GTFS-extracted public transport transit files:
# TRANSIT_URL:
# Urls for production maps generation.
# UGC_URL:
# HOTELS_URL:
# PROMO_CATALOG_CITIES:
# POPULARITY_URL:
# FOOD_URL:
# FOOD_TRANSLATIONS_URL:
# SRTM_PATH:
# ISOLINES_PATH:
# ADDRESSES_PATH:
# Local path (not url!) to .csv files.
# UK_POSTCODES_URL:
# US_POSTCODES_URL:
[Stages]
# Set to 1 to update the entire OSM planet file (as taken from "planet.openstreetmap.org")
# via an osmupdate tool before the generation. Not for use with partial planet extracts.
NEED_PLANET_UPDATE: 0
# If you want to calculate diffs you need to specify where the old maps are,
# e.g. ${Main:MAIN_OUT_PATH}/2021_03_16__09_00_00/
DATA_ARCHIVE_DIR: ${Generator tool:USER_RESOURCE_PATH}
# How many versions in the archive to use for diff calculation:
DIFF_VERSION_DEPTH: 2
[Common]
# Default parallelism level for most jobs. Set to 0 for auto detection.
THREADS_COUNT: 0
[Stats]
# Path to rules for calculating statistics by type:
STATS_TYPES_CONFIG: ${Developer:OMIM_PATH}/tools/python/maps_generator/var/etc/stats_types_config.txt
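The `${Section:KEY}` references throughout this ini match the `ExtendedInterpolation` syntax of Python's `configparser` (an assumption about how `maps_generator` loads it). A minimal demonstration of how the cross-section references resolve:

```python
import configparser

# ${Section:KEY} resolution as done by configparser.ExtendedInterpolation;
# assumed to be how maps_generator reads this ini.
cfg = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
cfg.read_string("""
[Developer]
OMIM_PATH: ~/OM/organicmaps
[External]
PLANET_URL: https://download.geofabrik.de/europe/macedonia-latest.osm.pbf
PLANET_MD5_URL: ${External:PLANET_URL}.md5
[Generator tool]
USER_RESOURCE_PATH: ${Developer:OMIM_PATH}/data
""")
assert cfg["Generator tool"]["USER_RESOURCE_PATH"] == "~/OM/organicmaps/data"
assert cfg["External"]["PLANET_MD5_URL"].endswith(".osm.pbf.md5")
```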

View file

@ -0,0 +1,107 @@
[Developer]
# Path to the `comaps` source code repository:
OMIM_PATH: ~/comaps
# A path with the generator_tool binary:
BUILD_PATH: ~/omim-build-relwithdebinfo
[Main]
# A special small planet file will be downloaded if DEBUG is set to 1.
DEBUG: 0
# The main working directory. A subdirectory is created for each generator run
# which contains the planet and other downloads, temporary build files, logs and completed MWMs.
MAIN_OUT_PATH: /mnt/4tbexternal/osm-maps
# Path for storing caches for nodes, ways, relations.
# If it's not set then caches are stored inside the directory of the current build.
# CACHE_PATH: ${Main:MAIN_OUT_PATH}/cache
[Generator tool]
# Path to the data/ folder in the repository:
USER_RESOURCE_PATH: ${Developer:OMIM_PATH}/data
# Parallelism level for the features stage only. Set to 0 for auto detection.
THREADS_COUNT_FEATURES_STAGE: 0
# Do not change it. This is determined automatically.
NODE_STORAGE: mem
[Osm tools]
# Path to osmctools binaries:
OSM_TOOLS_PATH: /usr/bin/
# If the binaries are found neither in the configured path nor system-wide,
# then the tools are built from the sources:
OSM_TOOLS_SRC_PATH: ${Developer:OMIM_PATH}/tools/osmctools
[Logging]
# maps_generator's general (python output only) log file path and name.
# More detailed logs that include output of the `generator_tool` binary
# are located in the `logs/` subdir of a particular build directory,
# e.g. `maps_build/2023_06_04__20_05_07/logs/`.
LOG_FILE_PATH: ${Main:MAIN_OUT_PATH}/generation.log
[External]
# Planet file location. It should be a dump of OSM data in osm.pbf format.
# By default it's an entire planet from "planet.openstreetmap.org".
# Or set it to a particular country/region extract from e.g. [Geofabrik](http://download.geofabrik.de/index.html).
# Note that an entire planet generation takes 40+ hours on a 256GB RAM server (and 1TB+ disk space).
# Stick to smaller extracts unless you have a machine this large.
# Here and further, it's possible to specify either a URL (to be downloaded automatically)
# or a local file path like file:///path/to/file.
# A sample URL to download the latest OSM dump for North Macedonia:
PLANET_URL: file:///home/planet/planet/planet.o5m
# Location of the md5 checksum of the planet file:
PLANET_MD5_URL: ${External:PLANET_URL}.md5
# A base url to the latest_coasts.geom and latest_coasts.rawgeom files.
# For example, if PLANET_COASTS_URL = https://somesite.com/download/
# then the https://somesite.com/download/latest_coasts.geom url will be used to download latest_coasts.geom and
# the https://somesite.com/download/latest_coasts.rawgeom url will be used to download latest_coasts.rawgeom.
# Comment to skip getting the coastlines files.
PLANET_COASTS_URL: file:///home/planet/
# Should be 'true' for an entire planet build: it adds a special routing section to World.mwm
# that alerts about absent regions without which a route can't be built.
# NEED_BUILD_WORLD_ROADS: true
# Subway file location, see docs/SUBWAY_GENERATION.md if you want to generate your own file.
# Comment to disable subway layer generation.
SUBWAY_URL: file:///home/planet/subway/subways.transit.json
# Location of the EXPERIMENTAL GTFS-extracted public transport transit files:
# TRANSIT_URL:
# Urls for production maps generation.
# UGC_URL:
# HOTELS_URL:
# PROMO_CATALOG_CITIES:
# POPULARITY_URL:
# FOOD_URL:
# FOOD_TRANSLATIONS_URL:
SRTM_PATH: /home/planet/SRTM-patched-europe/
ISOLINES_PATH: /home/planet/isolines/
ADDRESSES_PATH: /home/planet/tiger/
# Local path (not url!) to .csv files.
UK_POSTCODES_URL: /home/planet/postcodes/gb-postcode-data/gb_postcodes.csv
US_POSTCODES_URL: /home/planet/postcodes/us-postcodes/uszips.csv
[Stages]
# Set to 1 to update the entire OSM planet file (as taken from "planet.openstreetmap.org")
# via an osmupdate tool before the generation. Not for use with partial planet extracts.
NEED_PLANET_UPDATE: 0
# If you want to calculate diffs you need to specify where the old maps are,
# e.g. ${Main:MAIN_OUT_PATH}/2021_03_16__09_00_00/
DATA_ARCHIVE_DIR: ${Generator tool:USER_RESOURCE_PATH}
# How many versions in the archive to use for diff calculation:
DIFF_VERSION_DEPTH: 2
[Common]
# Default parallelism level for most jobs. Set to 0 for auto detection.
THREADS_COUNT: 0
[Stats]
# Path to rules for calculating statistics by type:
STATS_TYPES_CONFIG: ${Developer:OMIM_PATH}/tools/python/maps_generator/var/etc/stats_types_config.txt

# This file is used to calculate statistics by type.
# File format:
# Regular expression of type;metric type;statistic name
#
# Regular expression of type
# Type names are listed in data/mapcss-mapping.csv.
# Replace the '|' character in a type name with the '-' character.
#
# Metric type
# There are four types of metrics:
# 1) cnt - to count the number of objects
# 2) cnt_names - to count the number of objects that have a name
# 3) len - to calculate the total length of objects
# 4) area - to calculate the total area of objects
#
# Statistic name
# The name under which the statistic will be written to the resulting file.
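# A hypothetical worked example (this rule is illustrative and not part of
# the shipped rule set): the line
#   leisure-garden;area;Gardens
# would match every feature whose type matches the regular expression
# "leisure-garden", sum their areas, and report the total in the resulting
# file under the statistic name "Gardens".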
barrier-(fence|gate);len;Fences
building;cnt;Building
(amenity|shop|historic)-.*;cnt;POI
(amenity|shop|historic)-.*;cnt_names;POI with names
amenity-(cafe|restaurant|fast_food).*;cnt;Cafes and restaurants
amenity-(pub|bar);cnt;Bars and pubs
amenity-kindergarten;cnt;Kindergartens
amenity-(school|university|college);cnt;Schools and universities
amenity-parking.*;cnt;Parking lots
amenity-parking.*;area;Parking lots
amenity-pharmacy;cnt;Pharmacies
amenity-place_of_worship.*;cnt;Temples
amenity-(hospital|doctors);cnt;Hospitals and clinics
amenity-toilets;cnt;Toilets
amenity-(waste_disposal|recycling);cnt;Garbage bins
highway-(motorway|trunk|primary|secondary|tertiary|residential|unclassified|service|track|living_street)(_link)?(-.*)?;len;Road network
highway-(footway|path|pedestrian|steps).*;len;Footpaths
highway-.*-bridge;len;Bridges
highway-.*-tunnel;len;Tunnels
highway-(footway|path|steps)-bridge;len;Pedestrian bridges
highway-(footway|path|steps)-tunnel;len;Pedestrian tunnels
highway-steps.*;len;Stairs
highway-speed_camera;cnt;Speed cameras
internet_access-wlan;cnt;Wi-Fi access points
leisure-(pitch|stadium|playing_fields|track|sports_centre).*;cnt;Sports grounds and complexes
leisure-playground;cnt;Playgrounds
man_made-lighthouse;cnt;Lighthouses
man_made-windmill;cnt;Windmills
man_made-pipeline.*;len;Pipelines
natural-beach;cnt;Beaches
natural-tree;cnt;Trees
natural-waterfall;cnt;Waterfalls
piste:type.*;len;Ski trails
place-(city.*|town|village|hamlet);cnt;Settlements
place-island;cnt;Islands
power-(minor_)?line.*;len;Power lines
power-(pole|tower);cnt;Power line supports
railway-(rail|monorail|light_rail|narrow_gauge|preserved|siding|spur|yard|disused|incline).*;len;Railways
railway-.*-(bridge|tunnel);len;Railway bridges and tunnels
railway-(razed|abandoned).*;len;Abandoned railways
railway-narrow_gauge.*;len;Narrow gauge railways
railway-tram(-.*)?;len;Tram rails
railway-(halt|station);cnt;Railway stations
railway-subway.*;len;Subway lines
highway-bus_stop|railway-tram_stop;cnt;Ground transportation stops
shop-bakery;cnt;Bakeries
shop-books;cnt;Book stores
shop-clothes;cnt;Clothing stores
shop-shoes;cnt;Shoe stores
shop-(convenience|supermarket);cnt;Grocery stores
shop-florist;cnt;Flower shops
shop-(hairdresser|beauty);cnt;Hairdressers and beauty salons
tourism-(guest_house|hos?tel|motel);cnt;Hotels and hostels
tourism-(attraction|viewpoint);cnt;Attractions and viewpoints
waterway-(canal|river|stream)(-.*)?;len;Rivers, canals and streams
landuse-cemetery.*;area;Cemeteries
leisure-park.*;area;Parks
natural-beach;area;Beaches
sponsored-booking;cnt;Booking hotels