request for a smaller version of the combined.json #17
One of the primary motivations for this project is to achieve a very high level of boundary accuracy. One can see a big improvement when comparing tz_world to timezone-boundary-builder. Thus, I am very hesitant to do any simplification of the boundaries. As for the various libraries that implement lookups based on this data, each of them seems to have its own compression methodology; some go so far as simplifying geometries. In issue #11 a user mentioned that a lot of coordinates are excessively precise. I think reducing the precision to 5 or 6 decimal places could have a good effect on reducing the file size, and that is something I can commit to pursuing.
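As a minimal sketch of that idea (the helper name roundCoords is mine, not part of this project), rounding every coordinate of a GeoJSON coordinates array in PHP could look like this:

```php
// Hypothetical helper: recursively round all values in a (nested)
// GeoJSON coordinates array to a fixed number of decimal places.
function roundCoords(array $coords, int $decimals): array
{
    return array_map(
        fn ($c) => is_array($c) ? roundCoords($c, $decimals) : round($c, $decimals),
        $coords
    );
}
```

Re-encoding the rounded geometries drops the excess digits, and the file shrinks accordingly.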
One would need to do some calculations on how many decimal places are required for a certain accuracy, but my actual suggestion is to compile a completely separate .json file with reduced size (while keeping the bigger, more accurate one).
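As a back-of-the-envelope version of that calculation (assuming roughly 111.32 km per degree of latitude, a figure not spelled out in the thread):

```php
// One degree of latitude spans about 111,320 m, so truncating coordinates
// to d decimal places introduces at most roughly 111320 / 10^d meters of
// error (about half that when rounding instead of truncating).
foreach ([4, 5, 6] as $d) {
    printf("%d decimals: <= %.2f m\n", $d, 111320 / (10 ** $d));
}
// 4 decimals: <= 11.13 m, 5 decimals: <= 1.11 m, 6 decimals: <= 0.11 m
```

This matches the 4-decimals / ~10 m trade-off mentioned below.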
As noted in my previous message, I believe that is a task best left to downstream users of the output data.
@jannikmi I am working on a project that needs performance more than accuracy, as the timezones are stored in a database and the originals are very heavy. The process involves reducing the coordinates to 4 decimal places (a 10-meter error is acceptable, see https://gis.stackexchange.com/a/8674) and applying simplification methods in the database when inserting the polygons (https://gis.stackexchange.com/a/428927).
Thanks for referencing this here. Perhaps it would be helpful for other users if you make your data compression code available once you're done!
@jannikmi I'm using PHP and MySQL. PostgreSQL has ST_SimplifyPolygonHull and MySQL has ST_Simplify. My code is similar to:

```php
use Illuminate\Support\Facades\DB;
use Illuminate\Database\Query\Expression;

/**
 * Build a raw SQL expression that parses a GeoJSON geometry and
 * optionally simplifies it with MySQL's ST_Simplify.
 *
 * @param \stdClass $geometry
 * @param float     $simplify ST_Simplify tolerance (0 = no simplification)
 *
 * @return \Illuminate\Database\Query\Expression
 */
function geomFromGeoJSON(stdClass $geometry, float $simplify = 0): Expression
{
    // Normalize plain Polygons to MultiPolygon so every row has the same type.
    if ($geometry->type !== 'MultiPolygon') {
        $geometry->type = 'MultiPolygon';
        $geometry->coordinates = [$geometry->coordinates];
    }

    $sql = sprintf("ST_GeomFromGeoJSON('%s', 2, 0)", json_encode($geometry));
    $sql = 'ST_Simplify('.$sql.', '.$simplify.')';

    return DB::raw($sql);
}

/**
 * Try to store a zone simplified; if MySQL rejects the simplified
 * geometry (e.g. it becomes invalid), fall back to the original.
 *
 * @param \stdClass $zone
 *
 * @return void
 */
function zoneSave(stdClass $zone): void
{
    try {
        zoneUpdateOrInsert($zone, 0.005);
    } catch (Throwable $e) {
        zoneUpdateOrInsert($zone, 0);
    }
}

/**
 * @param \stdClass $zone
 * @param float     $simplify
 *
 * @return void
 */
function zoneUpdateOrInsert(stdClass $zone, float $simplify): void
{
    TimeZoneModel::updateOrInsert(
        ['zone' => $zone->properties->tzid],
        ['geojson' => geomFromGeoJSON($zone->geometry, $simplify)]
    );
}

// Truncate every coordinate to 4 decimal places in the raw JSON,
// then store each timezone feature.
$json = file_get_contents('combined-with-oceans.json');
$json = preg_replace('/([0-9]\.[0-9]{4})[0-9]+/', '$1', $json);

foreach (json_decode($json)->features as $zone) {
    zoneSave($zone);
}
```

Regards,
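A hypothetical lookup against the resulting table could then look like this (the table and column names time_zones, zone, and geojson are assumptions based on the snippet above, not taken from the thread; the geometries were stored with SRID 0, so a plain POINT(lon, lat) is directly comparable):

```php
// Hypothetical point-in-polygon lookup (names assumed): find the tzid
// whose stored MultiPolygon contains a given longitude/latitude pair.
$row = DB::selectOne(
    'SELECT zone FROM time_zones WHERE ST_Contains(geojson, POINT(?, ?)) LIMIT 1',
    [$longitude, $latitude]
);
$tzid = $row->zone ?? null;
```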
@jannikmi, as a final comment, the size of …
@eusonlito I am interested in a smaller shape/GeoJSON. I am looking to find a very small timezone GeoJSON (it is only used in a weather app to get the timezone for a weather location). Can you explain a bit more how you simplified it? Did you clone this repo and modify some files?
@farfromrefug I'm not worried about holes :) My code is the snippet posted above.
@eusonlito ok so you end up with something similar to mine :D |
For some applications it might not be feasible to have a 120 MB .json file as the data basis.
With certain simplifications and tricks it should theoretically be possible to compress the data to around 7 MB (cf. the timezonefinderL data, consisting of simplified tz_world data).
Without going to those extremes, the question is how to reduce the data size while still keeping an acceptable level of accuracy.
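One classic way to do that, sketched here for illustration (this is the generic Ramer–Douglas–Peucker algorithm, not code from this project; ST_Simplify and friends implement the same idea), is to drop ring points that deviate from a chord by less than a tolerance:

```php
/** Perpendicular distance from $p to the line through $a and $b (planar approx.). */
function perpDistance(array $p, array $a, array $b): float
{
    $dx = $b[0] - $a[0];
    $dy = $b[1] - $a[1];
    $len = hypot($dx, $dy);
    if ($len == 0.0) {
        return hypot($p[0] - $a[0], $p[1] - $a[1]);
    }
    return abs($dy * $p[0] - $dx * $p[1] + $b[0] * $a[1] - $b[1] * $a[0]) / $len;
}

/** Recursively keep only points farther than $epsilon from the endpoint chord. */
function rdpSimplify(array $points, float $epsilon): array
{
    $n = count($points);
    if ($n < 3) {
        return $points;
    }
    // Find the point farthest from the chord between the two endpoints.
    $maxDist = 0.0;
    $index = 0;
    for ($i = 1; $i < $n - 1; $i++) {
        $d = perpDistance($points[$i], $points[0], $points[$n - 1]);
        if ($d > $maxDist) {
            $maxDist = $d;
            $index = $i;
        }
    }
    if ($maxDist <= $epsilon) {
        // Everything between the endpoints is within tolerance: drop it.
        return [$points[0], $points[$n - 1]];
    }
    // Split at the farthest point and simplify both halves.
    $left = rdpSimplify(array_slice($points, 0, $index + 1), $epsilon);
    $right = rdpSimplify(array_slice($points, $index), $epsilon);
    return array_merge(array_slice($left, 0, -1), $right);
}
```

Applied per ring with the tolerance in coordinate degrees, this trades accuracy for size in a controlled way; note that borders shared by two zones would need to be simplified consistently, otherwise the simplified zones develop the gaps and overlaps ("holes") mentioned above.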