- WikiTaxi is an app that lets you download Wikipedia’s database to your computer, where you can view, search, and browse it offline. According to the project page, it’s a “single-file application” that “does not require a database engine or HTML browser.” So how does WikiTaxi do it? The app uses Wikipedia’s original database dumps.
Dec 07, 2016 · Using XOWA is the easiest way to download Wikipedia for offline use, by far. It requires zero configuration and aside from download time, you can have it up and running in a matter of minutes. There are other ways obviously, but they’re not for the faint of heart.
Kiwix, Wikipedia offline. The whole of Wikipedia on your device! The app is a lightweight piece of software reading bigger files stored on your device or SD card: once it is installed, you can select which additional content you would like to download (Wikipedia, Wiktionary, TED talks, etc.) and be ready for when your internet connection is bad ...
Currently the only way to update your offline version of Wikipedia is to download an entirely new package. Once you have decided which package you want, you are ready to go to the repository where all the content and software are stored.
- Offline Wikipedia Readers
- Where Do I Get It?
- Should I Get Multistream?
- Where Are the Uploaded Files (Image, Audio, Video, etc.)?
- Dealing with Compressed Files
- Dealing with Large Files
- Why Not Just Retrieve Data from wikipedia.org at Runtime?
- Database Schema
- Help to Parse Dumps For Use in Scripts
- Static HTML Tree Dumps for Mirroring or CD Distribution
Some of the many ways to read Wikipedia while offline:
1. XOWA: § XOWA
2. Kiwix: § Kiwix
3. WikiTaxi: § WikiTaxi (for Windows)
4. aarddict: § Aard Dictionary
5. BzReader: § BzReader and MzReader (for Windows)
6. Selected Wikipedia articles as a PDF, OpenDocument, etc.: Wikipedia:Books
7. Selected Wikipedia articles as a printed book: Help:Books/Printed books
8. Wiki as E-Book: § E-book
9. WikiFilter: § WikiFilter
10. Wikipedia on rockbox: § Wikiviewer for Rockbox
Some of them are mobile applications; see "list of Wikipedia mobile applications".
1. Dumps from any Wikimedia Foundation project: dumps.wikimedia.org and the Internet Archive
2. English Wikipedia dumps in SQL and XML: dumps.wikimedia.org/enwiki/ and the Internet Archive
2.1. Download the data dump using a BitTorrent client (torrenting has many benefits and reduces server load, saving bandwidth costs).
2.2. pages-articles-multistream.xml.bz2 – Current revisions only, no talk or user pages; this is probably what you want, and is approximately 18 GB compressed (expands to over...
TL;DR: GET THE MULTISTREAM VERSION! (and the corresponding index file, pages-articles-multistream-index.txt.bz2) pages-articles.xml.bz2 and pages-articles-multistream.xml.bz2 both contain the same XML contents, so if you unpack either, you get the same data. But with multistream, it is possible to get an article from the archive without unpacking the whole thing. Your reader should handle this for you; if your reader doesn't support it, it will work anyway, since multistream and non-multistream contain the same XML. The only downside to multistream is that it is marginally larger. You might be tempted to get the smaller non-multistream archive, but this will be useless if you don't unpack it, and it will unpack to ~5-10 times its original size. Penny wise, pound stupid. Get multistream. NOTE THAT the multistream dump file contains multiple bz2 'streams' (bz2 header, body, footer) concatenated together into one file, in contrast to the vanilla file, which contains one stream. Each separa...
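To illustrate why the multistream layout allows reading one article without unpacking everything, here is a minimal Python sketch. It assumes each line of the index file has the form `offset:page_id:title`, where `offset` is the byte position of a bz2 stream inside the dump; the function names and the tiny in-memory "dump" are hypothetical and only stand in for the real files.

```python
import bz2
import io


def parse_index_line(line):
    """Split an assumed 'offset:page_id:title' index line.

    Titles may themselves contain colons, so split at most twice.
    """
    offset, page_id, title = line.rstrip("\n").split(":", 2)
    return int(offset), int(page_id), title


def read_stream(dump_file, offset):
    """Decompress only the single bz2 stream that starts at `offset`."""
    dump_file.seek(offset)
    decompressor = bz2.BZ2Decompressor()
    chunks = []
    while not decompressor.eof:
        block = dump_file.read(64 * 1024)
        if not block:
            break  # file ended before the stream did (truncated download)
        chunks.append(decompressor.decompress(block))
    return b"".join(chunks)


# Tiny stand-in for a multistream dump: two bz2 streams concatenated.
first = bz2.compress(b"<page>A</page>")
second = bz2.compress(b"<page>B</page>")
fake_dump = io.BytesIO(first + second)

# Jump straight to the second stream without touching the first.
print(read_stream(fake_dump, len(first)))
```

With a real dump, the offset would come from the index file rather than from `len(first)`, and `dump_file` would be the dump opened in binary mode.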
Images and other uploaded media are available from mirrors in addition to being served directly from Wikimedia servers. Bulk download is (as of September 2013) available from mirrors but not offered directly from Wikimedia servers. See the list of current mirrors. You should rsync from the mirror, then fill in the missing images from upload.wikimedia.org; when downloading from upload.wikimedia.org you should throttle yourself to 1 cache miss per second (you can check headers on a response to see if it was a hit or miss and then back off when you get a miss) and you shouldn't use more than one or two simultaneous HTTP connections. In any case, make sure you have an accurate user agent string with contact info (email address) so ops can contact you if there's an issue. You should be getting checksums from the MediaWiki API and verifying them. The API Etiquette page contains some guidelines, although not all of them apply (for example, because upload.wikimedia.org isn't MediaWiki, there i...
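The throttling advice above could be sketched roughly as follows in Python. This is only an illustration: the header name `X-Cache`, the rule that a value containing "hit" means a cache hit, and the tool name are all assumptions, not something the text specifies. No network request is made here; the functions only build request headers and decide how long to wait.

```python
def request_headers(contact_email, tool_name="example-mirror-tool"):
    """Build a User-Agent that tells ops who to contact (hypothetical tool name)."""
    return {"User-Agent": f"{tool_name}/0.1 ({contact_email})"}


def delay_after(response_headers, miss_delay=1.0):
    """Return how many seconds to wait before the next request.

    Assumption: the cache result is reported in an 'X-Cache' header and a
    value containing 'hit' means the file was served from cache. Anything
    else, including a missing header, is treated as a miss.
    """
    cache = response_headers.get("X-Cache", "").lower()
    return 0.0 if "hit" in cache else miss_delay


print(request_headers("ops-contact@example.org"))
print(delay_after({"X-Cache": "cp1 miss"}))
```

A download loop would call `time.sleep(delay_after(...))` after each response and keep to one or two connections at a time.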
Dump files are heavily compressed, so once decompressed they take up a large amount of drive space. A large list of decompression programs is described in Comparison of file archivers. The following programs in particular can be used to decompress bzip2, .bz2, .zip, and .7z files.
Windows
Beginning with Windows XP, a basic decompression program enables decompression of zip files. Among others, the following can be used to decompress bzip2 files.
1. bzip2 (command-line) is available for free under a BSD license.
2. 7-Zip is available for free under an LGPL license.
3. WinRAR
4. WinZip
Macintosh (Mac)
1. OS X ships with the command-line bzip2 tool.
GNU/Linux
1. Most GNU/Linux distributions ship with the command-line bzip2 tool.
Berkeley Software Distribution (BSD)
1. Some BSD systems ship with the command-line bzip2 tool as part of the operating system. Others, such as OpenBSD, provide it as a package which must first be installed.
Notes
1. Some older vers...
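As an alternative to the command-line tools listed above, a .bz2 file can also be decompressed from a script. The following is a minimal Python sketch using the standard `bz2` module; the function name, file names, and sample payload are made up for the demonstration. It streams the data in chunks so the whole file never has to fit in memory.

```python
import bz2
import os
import shutil
import tempfile


def decompress_bz2(src_path, dst_path, chunk_size=1024 * 1024):
    """Stream-decompress src_path to dst_path; return the output size in bytes."""
    with bz2.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
    return os.path.getsize(dst_path)


# Demo on a small throwaway archive (a stand-in for a real dump file).
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "sample.xml.bz2")
output = os.path.join(workdir, "sample.xml")
payload = b"<page>example</page>\n" * 1000
with open(archive, "wb") as f:
    f.write(bz2.compress(payload))

written = decompress_bz2(archive, output)
print(os.path.getsize(archive), "bytes compressed ->", written, "bytes unpacked")
```

The size difference printed at the end is a small-scale version of the drive-space warning above.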
As files grow in size, so does the likelihood they will exceed some limit of a computing device. Each operating system, file system, hard storage device, and software (application) has a maximum file size limit. Each one of these will likely have a different maximum, and the lowest limit of all of them will become the file size limit for a storage device. The older the software in a computing device, the more likely it will have a 2 GB file limit somewhere in the system. This is due to older software using 32-bit integers for file indexing, which limits file sizes to 2^31 bytes (2 GB) (for signed integers), or 2^32 (4 GB) (for unsigned integers). Older C programming libraries have this 2 or 4 GB limit, but the newer file libraries have been converted to 64-bit integers thus supporting file sizes up to 2^63 or 2^64 bytes (8 or 16 EB). Before starting a download of a large file, check the storage device to ensure its file system can support files of such a large size, and check the am...
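The limits quoted above follow directly from the width of the integer used for file offsets. A short Python sketch of the arithmetic (the helper names are hypothetical, and the 18 GB figure is just an example of a large dump size):

```python
GIB = 2**30  # bytes in one gibibyte
EIB = 2**60  # bytes in one exbibyte


def offset_limit(bits, signed):
    """Size limit in bytes for an offset stored in `bits` bits."""
    return 2 ** (bits - 1) if signed else 2 ** bits


def exceeds_limit(size_bytes, bits, signed=True):
    """True if a file of size_bytes cannot be addressed with such an offset."""
    return size_bytes >= offset_limit(bits, signed)


print(offset_limit(32, signed=True) // GIB, "GiB")   # signed 32-bit
print(offset_limit(32, signed=False) // GIB, "GiB")  # unsigned 32-bit
print(offset_limit(64, signed=True) // EIB, "EiB")   # signed 64-bit

example_size = 18 * 10**9  # an 18 GB download, for illustration
print(exceeds_limit(example_size, bits=32), exceeds_limit(example_size, bits=64))
```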
Suppose you are building a piece of software that at certain points displays information that came from Wikipedia. If you want your program to display the information in a different way than can be seen in the live version, you'll probably need the wikicode that is used to enter it, instead of the finished HTML. Also, if you want to get all the data, you'll probably want to transfer it in the most efficient way that's possible. The wikipedia.org servers need to do quite a bit of work to convert the wikicode into HTML. That's time consuming both for you and for the wikipedia.org servers, so simply spidering all pages is not the way to go. To access any article in XML, one at a time, access Special:Export/Title of the article. Read more about this at Special:Export. Please be aware that live mirrors of Wikipedia that are dynamically loaded from the Wikimedia servers are prohibited. Please see Wikipedia:Mirrors and forks.
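For the one-article-at-a-time case, the Special:Export address can be built from the title. The sketch below assumes the English Wikipedia base URL `https://en.wikipedia.org/wiki/` and the usual convention of writing spaces as underscores; the function name is hypothetical, and the actual download (with an appropriate user agent) is left out.

```python
from urllib.parse import quote


def export_url(title, base="https://en.wikipedia.org/wiki/"):
    """Return the Special:Export URL for one article title.

    Assumptions: `base` is the wiki's article path, and spaces in titles
    are written as underscores. Other characters are percent-encoded.
    """
    return base + "Special:Export/" + quote(title.replace(" ", "_"))


print(export_url("Albert Einstein"))
```

This is meant for occasional single-page lookups, not for spidering every page, which the paragraph above advises against.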
See also: mw:Manual:Database layout. The SQL file used to initialize a MediaWiki database can be found here.
The XML schema for each dump is defined at the top of the file and is also described in the MediaWiki export help page. Wikipedia:Computer help desk/ParseMediaWikiDump describes the Perl Parse::MediaWikiDump library, which can parse XML dumps. Wikipedia preprocessor (wikiprep.pl) is a Perl script that preprocesses raw XML dumps and builds link tables and category hierarchies, collects anchor text for each article, etc.
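Besides the Perl tools mentioned above, a dump can be read with a generic streaming XML parser. The following Python sketch is not one of those tools; it uses the standard library's `xml.etree.ElementTree.iterparse` and assumes the dump contains `<page>` elements with a `<title>` and a `<revision><text>` child. The sample document and its namespace version are made up for the demonstration.

```python
import io
import xml.etree.ElementTree as ET

# Made-up miniature dump; the namespace URI version is an assumption.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Foo</title><revision><text>Hello [[Bar]]</text></revision></page>
  <page><title>Bar</title><revision><text>World</text></revision></page>
</mediawiki>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, wikitext) pairs one page at a time."""
    title = text = None
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text or ""
        elif name == "page":
            yield title, text
            title = text = None
            elem.clear()  # drop the finished page's children to limit memory use


for page_title, wikitext in iter_pages(io.BytesIO(SAMPLE)):
    print(page_title, "->", wikitext)
```

For a real dump, `stream` could be a file opened with `bz2.open(path, "rb")`, so the XML is parsed while it is being decompressed.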
MediaWiki 1.5 includes routines to dump a wiki to HTML, rendering the HTML with the same parser used on a live wiki. As the following page states, putting one of these dumps on the web unmodified will constitute a trademark violation. They are intended for private viewing in an intranet or desktop installation.
1. If you want to draft a traditional website in MediaWiki and dump it to HTML format, you might want to try mw2html by User:Connelly.
2. If you'd like to help develop dump-to-static HTML tools, please drop us a note on the developers' mailing list.
3. Static HTML dumps are now available here, but are not current.
See also:
1. mw:Alternative parsers lists some other, non-working options for getting static HTML dumps
2. Wikipedia:Snapshots
3. Wikipedia:TomeRaider database
Apr 23, 2013 · Method #1: Kiwix Kiwix is an offline reader that allows you to download the entire Wikipedia library (over 9 gigabytes) as seen in January 2012. Since that's a lot of content, there are no photos included.
- Osas Obaiza
- Free Applications
- Applications to Buy
- Other Source of Wikipedia
A few free applications and software programs are listed below, along with brief descriptions.
1. WikiTaxi. This software extracts the Wikipedia database files (.bz2) and converts them to a ".taxi" format. You can then read these pages in WikiTaxi as you would read a paperback on the bus ride to work.
2. Kiwix. This software is available for Windows, Mac, and Linux systems. All you have to do is install the application and download Wikipedia editions, which are packed as ".zim" files that open in Kiwix without any conversion required. The latest packages in ".zim" format can be downloaded from kiwix.org, or you can download torrent files instead. This app will allow you to read Wikipedia when you're offline. Kiwix is used in Mozilla. It is designed specifically for Wikipedia but it is also allowed for Wikimedia. Kiwix for Wikipedi...
Of course, if you are willing to spend a little, here are some other applications that are worth considering.
1. All of Wikipedia. This app lets you download Wikipedia, the largest encyclopedia, and is available for all iOS devices. It has more features and will alert you whenever new Wikipedia database dumps are available to download. Your download of Wikipedia will take 3 to 4 hours, depending on the speed of your internet connection. Last checked cost: $9.99.
2. WikiPock. This software is no less efficient than the others. It is available for many platforms, including Android, Windows Mobile, BlackBerry, and Symbian devices. An added bonus is that they will ship you a pre-loaded memory card with Wikipedia on it if you don't have enough space for it on your device. Last checked cost: $9.99. WikiPock brin...
1. Wiki Offline. This is a Mac application. You can change the theme and fonts of the app. It allows image galleries and full-screen reading.
2. WikiDroyd. This Android app turns Wikipedia articles into audio books through the use of a text-to-speech plugin. It supports over 40 languages. Images are not supported. ...
3. WikiReader. This is not an app but a device with a 3.5" monochrome screen. It contains 3 million Wikipedia articles. Here you can convert Wikipedia into a PDF book. You get updates through SD cards or conn...
4. Lexium. This provides a summary of the article content instead of the full Wikipedia content. You can print it by entering the code in the box. You can use it in various languages. ...
Kiwix is an offline reader for online content like Wikipedia, Project Gutenberg, or TED Talks. It makes knowledge available to people with no or limited internet access. The software as well as the content is free to use for anyone.