MaxMind GeoLite2 infinte download attempts #2021

jonathandhn · 2024-02-22T15:58:19Z

Shlink version

3.7.3

PHP version

8.2

How do you serve Shlink

Docker image

Database engine

MariaDB

Database version

10.3.23

Current behavior

Hi, I did start a container on January the 6th (root-less, on 3.7.3 form c70cf1b37087581cfcb7963d74d6c13fbee8555a7b10aa4af0493e70ade41202 docker image) and it did the job well until the MaxMind monthly renewal on February the 9

Here is the logs from January the 6th until successful download of the initial GeoIP database


 [ERROR] Error generating database.. Set SHELL_VERBOSITY=3 to see specific error
         info.                                                                  

Updating database... Success!
Generating proxies...

 [ERROR] Error generating proxies.. Set SHELL_VERBOSITY=3 to see specific error 
         info.                                                                  

Clearing entities cache... Success!
Initializing database if needed... Success!
Updating database... Success!
Generating proxies... Success!
Clearing entities cache... Success!
Configuring periodic visit location...
{"level":"debug","ts":1704569056839172230,"logger":"rpc","msg":"plugin was started","address":"tcp://127.0.0.1:6001","list of the plugins with RPC methods:":["informer","resetter","app","jobs","lock"]}
2024-01-06T19:24:16+0000	DEBUG	jobs        	initializing driver	{"pipeline": "shlink", "driver": "memory"}
2024-01-06T19:24:16+0000	DEBUG	jobs        	driver ready	{"pipeline": "shlink", "driver": "memory", "start": "2024-01-06T19:24:16+0000", "elapsed": "247.6µs"}
{"level":"debug","ts":1704569056839702129,"logger":"memory","msg":"pipeline was started","driver":"memory","pipeline":"shlink","start":"2024-01-06 19:24:16.839698029 +0000 UTC","elapsed":"13.7µs"}
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:18+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:19+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:19+0000	DEBUG	jobs        	exited from jobs pipeline processor
2024-01-06T19:24:19+0000	DEBUG	jobs        	exited from jobs pipeline processor
[INFO] RoadRunner server started; version: 2023.3.8, buildtime: 2023-12-14T16:05:30+0000
[INFO] sdnotify: not notified
2024-01-06T19:24:36+0000	INFO	server      	[2024-01-06T19:24:36.735201+00:00] [6a10f477-9e03-4ae0-89f6-f7551c1269e4] Access.INFO - OPTIONS /rest/v3/mercure-info 204 0
2024-01-06T19:24:36+0000	INFO	server      	[2024-01-06T19:24:36.810120+00:00] [7ae7666f-e550-4532-bcf7-b7c04331a2b4] Access.INFO - OPTIONS /rest/v3/visits 204 0
2024-01-06T19:24:36+0000	INFO	server      	[2024-01-06T19:24:36.830698+00:00] [755bf4cb-bc0b-4af1-8f64-ba6c2d2ff663] Access.INFO - OPTIONS /rest/v3/short-urls?itemsPerPage=5&orderBy=dateCreated-DESC 204 0
2024-01-06T19:24:36+0000	INFO	server      	[2024-01-06T19:24:36.962520+00:00] [5a4f04a1-3acf-42a4-8f98-493cb768f61e] Shlink.WARNING - This Shlink instance is not integrated with a mercure hub.

2024-01-06T19:24:36+0000	INFO	server      	[2024-01-06T19:24:36.963031+00:00] [5a4f04a1-3acf-42a4-8f98-493cb768f61e] Access.INFO - GET /rest/v3/mercure-info 501 189
2024-01-06T19:24:37+0000	INFO	server      	[2024-01-06T19:24:37.130192+00:00] [9c573b4a-9854-43b5-9217-e3bf7f0240bc] Access.INFO - GET /rest/v3/short-urls?itemsPerPage=5&orderBy=dateCreated-DESC 200 2482
2024-01-06T19:24:37+0000	INFO	server      	[2024-01-06T19:24:37.563617+00:00] [46427426-90a7-4b6a-a4f1-13f88a480436] Access.INFO - GET /rest/v3/visits 200 184
2024-01-06T19:30:56+0000	INFO	server      	[2024-01-06T19:30:56.084975+00:00] [dd93b595-76e5-40f2-aeef-1044b8a5dd66] Shlink.WARNING - Tried to locate visit with id "385438", but a GeoLite2 db was not found.

2024-01-06T19:30:56+0000	DEBUG	jobs        	job was pushed successfully	{"ID": "fb7ce40d-42d4-489d-a950-6e423f72f08f", "pipeline": "shlink", "driver": "memory", "start": "2024-01-06T19:30:56+0000", "elapsed": "94.4µs"}
{"level":"debug","ts":1704569456096737015,"logger":"memory","msg":"message pushed to the priority queue","current":1,"limit":10}
2024-01-06T19:30:56+0000	DEBUG	jobs        	job processing was started	{"ID": "fb7ce40d-42d4-489d-a950-6e423f72f08f", "start": "2024-01-06T19:24:19+0000", "elapsed": "6m36.701931243s"}
2024-01-06T19:30:56+0000	INFO	server      	[2024-01-06T19:30:56.129144+00:00] [dd93b595-76e5-40f2-aeef-1044b8a5dd66] Access.INFO - GET / 301 0
2024-01-06T19:30:56+0000	INFO	server      	[2024-01-06T19:30:56.149061+00:00] [NULL] Shlink.NOTICE - Downloading GeoLite2 db file...

2024-01-06T19:30:56+0000	INFO	server      	[2024-01-06T19:30:56.154348+00:00] [NULL] Shlink.NOTICE - Finished downloading GeoLite2 db file

and then the next relevant logs are :

2024-02-09T22:31:44+0000	INFO	server      	[2024-02-09T22:31:44.172486+00:00] [NULL] Shlink.NOTICE - Updating GeoLite2 db file...

2024-02-09T22:31:44+0000	INFO	server      	[2024-02-09T22:31:44.179021+00:00] [ca951555-db4b-44b3-99b2-47a7720070c2] Access.INFO - GET / 301 0
2024-02-09T22:31:44+0000	INFO	server      	[2024-02-09T22:31:44.181476+00:00] [NULL] Shlink.NOTICE - Finished updating GeoLite2 db file

2024-02-09T22:31:47+0000	DEBUG	jobs        	job was processed successfully	{"ID": "bf768829-d627-43f9-8f7f-d7efdf44ba2e", "start": "2024-02-09T17:33:28+0000", "elapsed": "4h58m18.467084645s"}
2024-02-09T22:34:12+0000	DEBUG	jobs        	job was pushed successfully	{"ID": "f3762ba1-c46f-4e3c-8fcc-b5b7daf45456", "pipeline": "shlink", "driver": "memory", "start": "2024-02-09T22:34:12+0000", "elapsed": "201.3µs"}
{"level":"debug","ts":1707518052939322575,"logger":"memory","msg":"message pushed to the priority queue","current":1,"limit":10}
2024-02-09T22:34:12+0000	DEBUG	jobs        	job processing was started	{"ID": "f3762ba1-c46f-4e3c-8fcc-b5b7daf45456", "start": "2024-02-09T17:33:37+0000", "elapsed": "5h0m35.303667944s"}
2024-02-09T22:34:12+0000	INFO	server      	[2024-02-09T22:34:12.940339+00:00] [NULL] Shlink.NOTICE - Updating GeoLite2 db file...

2024-02-09T22:34:12+0000	INFO	server      	[2024-02-09T22:34:12.942929+00:00] [89052956-93f0-42d2-9f5c-e4438cf57ffb] Access.INFO - GET / 301 0
2024-02-09T22:34:12+0000	INFO	server      	[2024-02-09T22:34:12.945200+00:00] [NULL] Shlink.NOTICE - Finished updating GeoLite2 db file

We are over 2000 downloads a day of GeoLite2-City prior to receiving a warning form MaxMind as early as 5:00 in the morning :

Daily GeoIP Database Download Limit Reached

From https://www.maxmind.com/ history for the very first occurrence :

Filename	Date	IP Address	City	Region	Country	ISP	Org
GeoLite2-City_20240209.tar.gz	2024-02-09 22:41:58	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240209.tar.gz	2024-02-09 22:41:54	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240209.tar.gz	2024-02-09 22:34:48	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240209.tar.gz	2024-02-09 22:34:13	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240209.tar.gz	2024-02-09 22:31:44	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240209.tar.gz	2024-02-09 22:00:19	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure
GeoLite2-City_20240105.tar.gz	2024-01-06 19:30:56	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure

NB : 22 of February 16:54 Paris time, I will not restart the container until 23:00 Paris time if you need more logs and data, as I will avoid the bug over locking out our MaxMind account for February 23rd and so stop and spine a new container before midnight.

Expected behavior

Succeed on success download of MaxMind Db

How to reproduce

Running the container with MaxMind Setup over 30 days.

The text was updated successfully, but these errors were encountered:

acelaya · 2024-02-22T16:36:27Z

Can you try opening a shell inside the container and running shlink visit:download-db -vvv?

jonathandhn · 2024-02-22T17:04:38Z

Can you try opening a shell inside the container and running shlink visit:download-db -vvv?

docker exec fierte.pm shlink visit:download-db -vvv

 [INFO] GeoLite2 db file is up to date.

jonathandhn · 2024-02-22T17:16:59Z

Actually, the loop stopped today, the last record is from this afternoon

Filename	Date	IP Address	City	Region	Country	ISP	Org
GeoLite2-City_20240220.tar.gz	2024-02-22 15:07:27	20.74.17.xxx	Paris	Paris	France	Microsoft Azure	Microsoft Azure

acelaya · 2024-02-22T17:54:33Z

Sometimes there are issues with some of their files, in which the metadata used by Shlink to determine if an update is needed, is wrong.

It's probably what happened this time, which probably caused your instance to think an update was needed on every new request, and just re-download the same file again and again.

I tried to update the file in my own instance, but it skipped that version, and the most recent one seems to be fine.

I'm going to try to verify if this is the case.

jonathandhn · 2024-02-22T17:59:46Z

Thanks you, because they move us from 1000 downloads a day to 30 on free plan, so we must be careful.

acelaya · 2024-02-22T18:05:35Z

Thanks you, because they move us from 1000 downloads a day to 30 on free plan, so we must be careful.

Ouch! Do you have some link where this is explained? I would like to reference it from the docs.

If I manage to confirm this was the problem, I'll try to find some way to mitigate it.

jonathandhn · 2024-02-22T18:10:29Z

Thanks you, because they move us from 1000 downloads a day to 30 on free plan, so we must be careful.

Ouch! Do you have some link where this is explained? I would like to reference it from the docs.

If I manage to confirm this was the problem, I'll try to find some way to mitigate it.

Here : https://comms.maxmind.com/daily-download-limit-decreasing-2

acelaya · 2024-02-22T18:35:34Z

Just checked the file from the 9th of February, and the metadata is correct. Shlink should not have tried to download it over and over.

The logic basically compares the GeoLite file's build time and checks if it's more than 35 days old, in which case it tries to download a new copy.

This is done with concurrency in mind, so a lock is set until download ends, to avoid multiple downloads in parallel.

Other potential reasons for this to happen are that there was not enough disk space to decompress the file after downloading it, or perhaps an issue with the system date that made Shlink think it was in the future.

I'll keep this open for now to see if I can think in some way to make the process more resilient.

sparanoid · 2024-03-03T06:31:15Z

Got the same bug last week and I also received download limit reached notification from MaxMind.

Other potential reasons for this to happen are that there was not enough disk space to decompress the file after downloading it

In my case the server has enough disk space to handle the file.

I've restart the shlink service to see if it will work.

acelaya · 2024-03-04T21:12:21Z

Could any of you check if your instances have some log entry starting with GeoLite2 database download failed?

sparanoid · 2024-03-08T08:07:12Z

Here's the log related to GeoLite2 database download failed

shlink.log

acelaya · 2024-03-08T08:21:02Z

Here's the log related to GeoLite2 database download failed

shlink.log

Yeah, that's basically showing that Shlink successfully downloaded new versions of the database on every visit, until it reached the API limit, and then all the instances of GeoLite2 database download failed are due to that limit and Shlink still attempting to download every time.

Unfortunately it does not explain why Shlink was still thinking a new instance was needed to be downloaded, when it had a fresh copy.

The only solution I can think of is to change how Shlink decides when a new copy is needed. Potential options:

Track when was the last successful download, and attempt new downloads after a period of time.
This is tricky though, as it needs to be tracked per instance, not on a central place like the database.
There are also many edge cases that would need to be handled.
Query the API to see if there's a new version available, and download it only in that case.
I don't even know if this is supported, or if those "queries" would also count for the download limit somehow, driving this useless.
Also, this might add extra latency.

For context, the way it works now is that Shlink reads the database metadata, for a value that tells when was it build. If a certain a mount of days has passed (35, if I remember correctly), or the database does not exist at all, it tries to download it.

It is very straightforward, has very low impact and keeps the GeoLite file as the single source of truth, which is convenient, but it is clearly not covering some particular scenario that I'm missing.

acelaya · 2024-03-08T22:15:00Z

There was a new report of this issue, but in there, it was mentioned this was happening with orphan visits specifically.

I checked again the log provided here, and I noticed there are many attempts on downloading the database as a result of an orphan visit.

I also see some attempts which do not seem to be linked with a particular request happening instants before it, though. @sparanoid could it be that you have some scheduled task to periodically download the GeoLite file, or that the logs were manipulated to remove sensitive information?

oschwald · 2024-03-08T23:05:18Z

I haven't looked too closely at the code, but it appears that you are downloading the file to a temporary file and then copying it to the final location. This could potentially result in a corrupted file if multiple requests are going at once. To prevent this, you could either write the file atomically or take out appropriate locks (or preferably both).

In order to write the file atomically, you should download it to the same directory as the final file to ensure the file is on the same file system, decompress it, and then rename the file to the final file name. You would either want to take a lock to ensure that no other request is writing to the same temporary files at the same time or you would want to use random names for the temporary files.

Some other thoughts:

Comparing the metadata time and the system time could result in excess downloads if the system time is off.
What happens if the file system is read-only or open_basedir is enabled and the database path is outside of it?

Edit:

I was looking at the code in shlink-ip-geolocation when I commented above, and missed this code in this repo:

shlink/module/CLI/src/GeoLite/GeolocationDbUpdater.php

Lines 41 to 49 in e244b2d

    
               $lock = $this->locker->createLock(self::LOCK_NAME); 
        
               $lock->acquire(true); // Block until lock is released 
        
               try { 
        
                   return $this->downloadIfNeeded($beforeDownload, $handleProgress); 
        
               } finally { 
        
                   $lock->release(); 
        
               } 
        
           }

I didn't look into how that locking works, but presumably it prevents multiple downloads at once.

SoCuul · 2024-03-09T03:32:03Z

I'm having the same problem with my instance, which just started happening in the last few days.

acelaya · 2024-03-09T05:26:15Z

I was looking at the code in shlink-ip-geolocation when I commented above, and missed this code in this repo:

shlink/module/CLI/src/GeoLite/GeolocationDbUpdater.php

Lines 41 to 49 in e244b2d

$lock = $this->locker->createLock(self::LOCK_NAME);

$lock->acquire(true); // Block until lock is released

try {

return $this->downloadIfNeeded($beforeDownload, $handleProgress);

} finally {

$lock->release();

}

}

I didn't look into how that locking works, but presumably it prevents multiple downloads at once.

Yes, that's correct. That lock prevents multiple downloads in parallel.

acelaya · 2024-03-09T05:28:56Z

I have a suspicion of what could be the problem. There might be some stateful service somewhere down the dependency tree, that's keeping a reference to the old database file metadata, making every check resolve that the file is too old, resulting in a new download.

acelaya · 2024-03-09T07:23:13Z

@oschwald answering to your comments:

In order to write the file atomically, you should download it to the same directory as the final file to ensure the file is on the same file system, decompress it, and then rename the file to the final file name. You would either want to take a lock to ensure that no other request is writing to the same temporary files at the same time

This is exactly how it's done.

Comparing the metadata time and the system time could result in excess downloads if the system time is off.

I thought about this, but it would have to be several days off, so I think it's a negligible risk.

If someone really has a system with a so messed up system time, I think it's reasonable that the expected solution in that case is to ask the admins to fix that, not to expect Shlink to work around the problem.

Ultimately, any solution that does not make a lot of MaxMind API requests would be time based, one way or another, so there's not much that can be done here.

What happens if the file system is read-only or open_basedir is enabled and the database path is outside of it?

Then nothing can be done and GeoLite files won't be downloaded. It's an unfortunate limitation due to how GeoLite db files work.

In any case, this already happened not long ago. The solution involved making sure Shlink only tries to write on its own data directory, and incidentally, on the tmp dir due to some external dirs.

acelaya · 2024-03-09T07:30:30Z

I have a suspicion of what could be the problem. There might be some stateful service somewhere down the dependency tree, that's keeping a reference to the old database file metadata, making every check resolve that the file is too old, resulting in a new download.

I can confirm this is the problem. There's an unintentional stateful service that's reading the GeoLite file metadata when created, and holding it in memory, making every check think the database is too old.

This is affecting all versions of Shlink, so I will try to backport it to v3.x if it's not too complex.

acelaya · 2024-03-09T08:54:39Z

I have just released version 4.0.2 and 3.7.4, both including the fix for this bug.

jonathandhn added the bug label Feb 22, 2024

github-project-automation bot added this to Shlink Feb 22, 2024

acelaya added the help wanted label Mar 8, 2024

acelaya pinned this issue Mar 8, 2024

acelaya changed the title ~~MaxMind GeoIP download bug~~ MaxMind GeoLite2 infinte download attempts Mar 8, 2024

acelaya mentioned this issue Mar 8, 2024

Repeated downloads of the GeoIP database #2048

Closed

This was referenced Mar 9, 2024

Provide a GeoLite2ReaderFactory service shlinkio/shlink-ip-geolocation#59

Closed

Create a factory service that can create GeoIp2\Database\Reader instances at runtime shlinkio/shlink-ip-geolocation#60

Merged

acelaya added this to the 4.0.2 milestone Mar 9, 2024

acelaya moved this to In Progress in Shlink Mar 9, 2024

acelaya removed this from the 4.0.2 milestone Mar 9, 2024

acelaya modified the milestones: 4.1.0, 4.0.2 Mar 9, 2024

acelaya removed the help wanted label Mar 9, 2024

acelaya mentioned this issue Mar 9, 2024

Fix infinite GeoLite2 downloads #2051

Merged

acelaya closed this as completed in #2051 Mar 9, 2024

github-project-automation bot moved this from In Progress to Done in Shlink Mar 9, 2024

acelaya mentioned this issue Mar 9, 2024

Fix infinite GeoLite2 downloads - v3.x backport #2052

Merged

acelaya unpinned this issue Mar 9, 2024

oliversturm mentioned this issue Apr 29, 2024

MaxMind GeoLite2 infinite download attempts - problem still exists with 4.0.2 #2114

Closed

acelaya mentioned this issue May 10, 2024

Improve GeoLite2 file downloads when using RoadRunner #2124

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MaxMind GeoLite2 infinte download attempts #2021

MaxMind GeoLite2 infinte download attempts #2021

jonathandhn commented Feb 22, 2024 •

edited

Loading

acelaya commented Feb 22, 2024

jonathandhn commented Feb 22, 2024 •

edited

Loading

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024 •

edited

Loading

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024

sparanoid commented Mar 3, 2024

acelaya commented Mar 4, 2024

sparanoid commented Mar 8, 2024

acelaya commented Mar 8, 2024 •

edited

Loading

acelaya commented Mar 8, 2024

oschwald commented Mar 8, 2024 •

edited

Loading

SoCuul commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024 •

edited

Loading

acelaya commented Mar 9, 2024

MaxMind GeoLite2 infinte download attempts #2021

MaxMind GeoLite2 infinte download attempts #2021

Comments

jonathandhn commented Feb 22, 2024 • edited Loading

Shlink version

PHP version

How do you serve Shlink

Database engine

Database version

Current behavior

Expected behavior

How to reproduce

acelaya commented Feb 22, 2024

jonathandhn commented Feb 22, 2024 • edited Loading

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024 • edited Loading

jonathandhn commented Feb 22, 2024

acelaya commented Feb 22, 2024

sparanoid commented Mar 3, 2024

acelaya commented Mar 4, 2024

sparanoid commented Mar 8, 2024

acelaya commented Mar 8, 2024 • edited Loading

acelaya commented Mar 8, 2024

oschwald commented Mar 8, 2024 • edited Loading

SoCuul commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024

acelaya commented Mar 9, 2024 • edited Loading

acelaya commented Mar 9, 2024

jonathandhn commented Feb 22, 2024 •

edited

Loading

jonathandhn commented Feb 22, 2024 •

edited

Loading

acelaya commented Feb 22, 2024 •

edited

Loading

acelaya commented Mar 8, 2024 •

edited

Loading

oschwald commented Mar 8, 2024 •

edited

Loading

acelaya commented Mar 9, 2024 •

edited

Loading