Repeating image of four skulls with increasing doubling, blurring, ghosting, pixelation, and horizontal glitching.

Matt Webb thinks that countries need to be thinking about building a ‘strategic fact reserve’. It’s an interesting proposition but also… how has it come to this?!

[I]f I were to rank AI (not today’s AI but once it is fully developed and integrated) I’d say it’s probably not as critical as infrastructure or capacity as energy, food or an education system.

But probably it’s probably on par with GPS. Which underpins everything from logistics to automating train announcements to retail.

[…]

I think we’re all assuming that the Internet Archive will remain available as raw feedstock, that Wikipedia will remain as a trusted source of facts to steer it; that there won’t be a shift in copyright law that makes it impossible to mulch books into matrices, and that governments will allow all of this data to cross borders once AI becomes part of national security.

Everything I’ve said is super low likelihood, but the difficulty with training data is that you can’t spend your way out of the problem in the future. The time to prepare is now.

[…]

Probably the best way to start is to take a snapshot of the internet and keep it somewhere really safe. We can sift through it later; the world’s data will never be more available or less contaminated than it is today. Like when GitHub stored all public code in an Arctic vault (02/02/2020): a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. Or the Svalbard Global Seed Vault.

But actually I think this is a job for librarians and archivists.

Source: Interconnected

Image: Kathryn Conrad