Search results
Top results related to how to download wikipedia dump files without a wiki article
Top Answer
Answered Apr 11, 2018 · 6 votes
You probably want to look at Kiwix which provides a complete offline Wikipedia experience with thumbnails. (~75 GiB)
The reason there isn't a tarball is that it's huge. The "Fair Use" media (low-resolution copyrighted images such as posters, album art, etc.) for English Wikipedia is 162 GiB. We also have a lot of media on Wikimedia Commons, 153 TiB, mostly unused.
Limiting to only what's necessary for English Wikipedia, you'll need to download 5.1 TiB from 4,525,268 non-multimedia files.
Top Answer
Answered Nov 27, 2020 · 2 votes
I used WinMerge for comparing these two standalone Chrome exes, version 86.0.4240.111 - 64 bits and 87.0.4280.66 - 64 bits, respectively ChromeStandaloneSetup64_28-10-2020_14.24.03_valid_Copy_old_version.exe (downloaded from https://dl.google.com/chrome/install/ChromeStandaloneSetup64.exe - broken exe aka missing the download URL tags within the exe) and ChromeStandaloneSetup64_valid_Copy_latest_version.exe (downloaded from https://www.google.com/chrome/?standalone=1&platform=win64 - valid exe aka exe that has the download URL tags within it).
As a comparison, here's a tagged download URL:
https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D%26iid%3D%7B04A7785F-B8A2-B4AA-7A45-17861EB0DE70%7D%26lang%3Dro%26browser%3D4%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26installdataindex%3Ddefaultbrowser/chrome/install/ChromeStandaloneSetup64.exe
and here is an untagged download URL:
https://dl.google.com/chrome/install/ChromeStandaloneSetup64.exe
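To see what the tag carries, the path segment after /tag/s/ can be percent-decoded into key/value pairs. The following is a minimal Python sketch, assuming the URL layout shown in the two examples above; the function name extract_tag_params is hypothetical and not part of any Google tooling.

```python
from urllib.parse import parse_qsl, unquote, urlsplit

TAGGED_URL = (
    "https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D"
    "%26iid%3D%7B04A7785F-B8A2-B4AA-7A45-17861EB0DE70%7D%26lang%3Dro%26browser%3D4"
    "%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers"
    "%26ap%3Dx64-stable-statsdef_1%26installdataindex%3Ddefaultbrowser"
    "/chrome/install/ChromeStandaloneSetup64.exe"
)
UNTAGGED_URL = "https://dl.google.com/chrome/install/ChromeStandaloneSetup64.exe"


def extract_tag_params(url):
    """Return the tag parameters of a download URL, or {} if it has no tag."""
    segments = urlsplit(url).path.split("/")
    # Assumed layout: /tag/s/<percent-encoded parameters>/chrome/install/<file>
    if len(segments) < 4 or segments[1:3] != ["tag", "s"]:
        return {}
    # The first decoding pass turns %3D and %26 back into "=" and "&".
    query_like = unquote(segments[3])
    # parse_qsl decodes the values once more (e.g. Google%20Chrome).
    return dict(parse_qsl(query_like, keep_blank_values=True))


if __name__ == "__main__":
    for key, value in extract_tag_params(TAGGED_URL).items():
        print(key, "=", value)
```

Running it on the untagged URL returns an empty dictionary, which is the difference between the two downloads described here.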
While in WinMerge, I've searched for "&lang=ro&browser" in ChromeStandaloneSetup64_valid_Copy_latest_version.exe (the file to the right), then I went to Edit > Select Line Difference (F4), then to Merge > Copy to Left. Next, I went to File > Save As > Save Left As...
I then ran the resulting exe (the patched exe, version 86.0.4240.111 64 bits) in a fresh Windows 7 virtual machine (VM), and this time it installed correctly. For testing purposes, I turned off the network, and the three-dot menu in Chrome > Help > About Chrome confirmed that Chrome was indeed the old version. After that, I turned the network back on and it updated to the latest version, 87.0.4280.66 64 bits, running with no problems.
The valid exe (ChromeStandaloneSetup64_valid_Copy_latest_version.exe) came with a Zone Identifier Alternate Data Stream (ADS), while the broken exe did not. Even so, the patched exe produced from the broken one, which lacks the ADS, ran perfectly fine in the VM. Regarding the ADS, I downloaded the same valid Chrome exe twice (87.0.4280.66 64 bits) and noticed that the ADS had exactly the same value both times.
I also did additional tests and I've noticed if the resulting exe is lacking the appguid and iid fields, it won't install.
I also experimented with the values in those fields, setting the appguid and iid fields to:
appguid={00000000-0000-0000-0000-000000000000}&iid={00000000-0000-0000-0000-000000000000}
, which turned out to make the executable not install, getting Error code: 0x80070057.
Once I was done editing the exes, I started experimenting with the URL.
As for this given URL, coming from https://www.google.com/chrome/?standalone=1&platform=win64 (valid exe):
https://dl.google.com/tag/s/appguid%3D%7B8A69D345-D564-463C-AFF1-A69D9E530F96%7D%26iid%3D%7B04A7785F-B8A2-B4AA-7A45-17861EB0DE70%7D%26lang%3Dro%26browser%3D4%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26installdataindex%3Ddefaultbrowser/chrome/install/ChromeStandaloneSetup64.exe
I trimmed down the appguid and iid, getting:
https://dl.google.com/tag/s/lang%3Dro%26browser%3D4%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26installdataindex%3Ddefaultbrowser/chrome/install/ChromeStandaloneSetup64.exe
, which led to a broken exe, getting the same error code as above, Error code: 0x80070057.
Next, I went to https://codebeautify.org/generate-random-data-from-regexp and generated a random appguid and iid (using hexadecimal values, because that is what those fields appeared to use) with this pattern:
[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}
, then using two different numbers:
038C42B4-AF87-AD38-990A-4384A7E29E04 - for appguid
EC7C4A9D-76CB-FBA7-D1C2-70EB689DA8F4 - for iid
, I got this link:
https://dl.google.com/tag/s/appguid%3D%7B038C42B4-AF87-AD38-990A-4384A7E29E04%7D%26iid%3D%7BEC7C4A9D-76CB-FBA7-D1C2-70EB689DA8F4%7D%26lang%3Dro%26browser%3D4%26usagestats%3D1%26appname%3DGoogle%2520Chrome%26needsadmin%3Dprefers%26ap%3Dx64-stable-statsdef_1%26installdataindex%3Ddefaultbrowser/chrome/install/ChromeStandaloneSetup64.exe
, which, during installation, got me "Unable to connect to the Internet. If you use a firewall, please whitelist GoogleUpdate.exe" error.
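The URL construction in this experiment can be reproduced offline. Below is a rough Python sketch, assuming the parameter list and double percent-encoding seen in the URLs above; random_guid and build_tagged_url are hypothetical helper names. As the answer reports, randomly generated values yield a URL that downloads but an installer that fails, so this only illustrates the encoding.

```python
import random
import re
from urllib.parse import quote

# The pattern from the answer: five groups of uppercase hex digits.
GUID_PATTERN = r"[0-9A-F]{8}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{4}-[0-9A-F]{12}"


def random_guid():
    """Generate a random value that matches GUID_PATTERN."""
    groups = (8, 4, 4, 4, 12)
    return "-".join(
        "".join(random.choice("0123456789ABCDEF") for _ in range(size))
        for size in groups
    )


def build_tagged_url(appguid, iid):
    """Build a tagged download URL with the parameters seen in the answer."""
    params = [
        ("appguid", "{" + appguid + "}"),
        ("iid", "{" + iid + "}"),
        ("lang", "ro"),
        ("browser", "4"),
        ("usagestats", "1"),
        ("appname", "Google Chrome"),
        ("needsadmin", "prefers"),
        ("ap", "x64-stable-statsdef_1"),
        ("installdataindex", "defaultbrowser"),
    ]
    # First pass: encode each value (space becomes %20), keeping the braces.
    inner = "&".join(key + "=" + quote(value, safe="{}") for key, value in params)
    # Second pass: encode the whole string so it fits into one path segment.
    tag = quote(inner, safe="")
    return "https://dl.google.com/tag/s/" + tag + "/chrome/install/ChromeStandaloneSetup64.exe"


if __name__ == "__main__":
    print(build_tagged_url(random_guid(), random_guid()))
```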
So, since the appguid and iid came from a later version of Chrome (taken from 87.0.4280.66 and inserted into 86.0.4240.111), I assume installation will finish successfully only if the appguid and iid contained in the exe itself yield, through some computation during installation, a result that matches a value stored somewhere in the exe, or a value that also gets computed during installation. I say this because when I installed the patched standalone Chrome exe produced from the broken exe, the network was turned off and there was no prior installation of Chrome on that system. Or maybe those two values are somehow tied to the operating system.
In conclusion, I think I'll end up modifying the URLs in the batch file with those appguid and iid values, unless I decide to take a closer look at the URLs of upcoming Chrome releases to get an even better idea of how all of this works.
These:
8A69D345-D564-463C-AFF1-A69D9E530F96 - for appguid
04A7785F-B8A2-B4AA-7A45-17861EB0DE70 - for iid
Top Answer
Answered Mar 09, 2018 · 18 votes
There are a number of ways of triggering a download. Following are a few:
Use a form:
<form method="get" action="mydoc.doc"><button type="submit">Download</button></form>
Use javascript:
<button type="submit" onclick="window.open('mydoc.doc')">Download</button>
Top Answer
Answered May 28, 2014 · 28 votes
There are 2 places where I can see you could potentially be building up memory usage:
- In the buffer reading your input file.
- In the buffer writing to your output stream (HTTPOutputStream?)
For #1 I would suggest reading directly from the file via FileInputStream without the BufferedInputStream. Try this first and see if it resolves your issue. ie:
FileInputStream in = new FileInputStream(file);
instead of:
BufferedInputStream in = new BufferedInputStream(new FileInputStream(file));
If #1 does not resolve the issue, you could try periodically flushing the output stream after so much data is written (decrease chunk size if necessary):
ie:
try {
    FileInputStream fileInputStream = new FileInputStream(file);
    byte[] buf = new byte[8192];
    int bytesread = 0, bytesBuffered = 0;
    while ((bytesread = fileInputStream.read(buf)) > -1) {
        out.write(buf, 0, bytesread);
        bytesBuffered += bytesread;
        if (bytesBuffered > 1024 * 1024) { // flush after 1MB
            bytesBuffered = 0;
            out.flush();
        }
    }
} finally {
    if (out != null) {
        out.flush();
    }
}
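For readers who want to try the flush-every-1-MB idea outside a servlet, here is a small Python sketch of the same loop. It works on any pair of binary file-like objects; the function name copy_with_periodic_flush and its defaults are assumptions made for illustration, not part of the answer.

```python
import io


def copy_with_periodic_flush(src, dst, chunk_size=8192, flush_every=1024 * 1024):
    """Copy src to dst in small chunks, flushing dst roughly every flush_every bytes."""
    total = 0
    buffered = 0
    try:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of input
                break
            dst.write(chunk)
            total += len(chunk)
            buffered += len(chunk)
            if buffered > flush_every:
                # Push pending bytes out so they do not pile up in memory.
                dst.flush()
                buffered = 0
    finally:
        # Always flush whatever is left, even if reading failed midway.
        dst.flush()
    return total


if __name__ == "__main__":
    source = io.BytesIO(b"x" * (3 * 1024 * 1024))
    sink = io.BytesIO()
    print(copy_with_periodic_flush(source, sink), "bytes copied")
```

Only one chunk of 8192 bytes is held by the loop at any time, which is the point of dropping the extra buffering layer.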
Top Answer
Answered May 23, 2017 · 0 votes
Assuming that you cannot enable API access to your account (see https://stackoverflow.com/a/27120094/2213940, which shows how to enable API access for just one organization within your domain), you could do the following. Using a browser debugger (I used Chrome's, which opens with F12), go to your Drive and select one of the files you want to download. In the debugger, open the "Network" tab and clear it to make things easier. Now download the file. If it's a non-Google file, just download it from there. You will see a request like "https://drive.google.com/uc?id=XXXXXXXXXX&authuser=0&export=download". You can use curl to make the same request: right-click the line and choose "Copy as cURL", which copies the cookie and other headers. That request gives you a download URL. If it's a Google file (like a spreadsheet), you need to do something similar, but it might only work if you open the spreadsheet first and use the menu to export as xlsx, which will be the best backup format. Those URLs will look like "https://docs.google.com/spreadsheets/d/XXXXX/export?format=xlsx&id=XXXX". Note that in both cases the key is to include the same cookie that your browser is using. I've used cookies like that (for Gmail, not Drive), and so far the cookie has lasted months without expiring. Make sure to secure the scripts well, as the cookie is for your own Drive and can be used maliciously.
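The same cookie-carrying request can be prepared from a script instead of curl. The following is a minimal Python sketch, assuming the URL shape quoted in the answer; build_drive_download_request is a hypothetical name, and the file id and cookie string are placeholders that must come from your own browser session.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_drive_download_request(file_id, cookie):
    """Prepare (but do not send) a download request that carries the browser cookie."""
    query = urlencode({"id": file_id, "authuser": 0, "export": "download"})
    url = "https://drive.google.com/uc?" + query
    # The Cookie header is what makes the server treat the script like your browser.
    return Request(url, headers={"Cookie": cookie})


if __name__ == "__main__":
    # Placeholder values: copy the real ones from the browser's Network tab.
    request = build_drive_download_request("XXXXXXXXXX", "SID=placeholder")
    print(request.full_url)
    # urllib.request.urlopen(request) would send it; keep the cookie out of
    # version control, since it grants access to the account.
```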
People also ask
How to download a Wikipedia dump file without visiting a wiki site?
- WikiFilter is a program which allows you to browse over 100 dump files without visiting a Wiki site. Start downloading a Wikipedia database dump file such as an English Wikipedia dump. It is best to use a download manager such as GetRight so you can resume downloading the file even if your computer crashes or is shut down during the download.
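Resuming works because HTTP lets a client ask for the remainder of a file with a Range header. The sketch below shows the idea in Python, assuming the server supports range requests; build_resume_request and resume_download are hypothetical names, and the dump URL in the usage lines is only an example to verify against the dumps site before use.

```python
import os
import shutil
import urllib.request


def build_resume_request(url, dest_path):
    """Return (request, offset), asking only for the bytes not yet on disk."""
    offset = os.path.getsize(dest_path) if os.path.exists(dest_path) else 0
    request = urllib.request.Request(url)
    if offset > 0:
        # "bytes=N-" means: send everything from byte N to the end.
        request.add_header("Range", "bytes=%d-" % offset)
    return request, offset


def resume_download(url, dest_path, chunk_size=1024 * 1024):
    """Download url to dest_path, appending if a partial file already exists."""
    request, _offset = build_resume_request(url, dest_path)
    # Note: a fully downloaded file makes the server answer 416, raised as HTTPError.
    with urllib.request.urlopen(request) as response:
        # 206 Partial Content means the Range header was honored; any other
        # success status means the server is sending the whole file again.
        mode = "ab" if response.status == 206 else "wb"
        with open(dest_path, mode) as out:
            shutil.copyfileobj(response, out, chunk_size)


if __name__ == "__main__":
    # Example URL (an assumption; check https://dumps.wikimedia.org/ for current files).
    dump_url = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
    request, offset = build_resume_request(dump_url, "enwiki-latest-pages-articles.xml.bz2")
    print("would resume from byte", offset)
```

Calling resume_download again after an interruption continues from the size of the partial file instead of starting over.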
Wikipedia:Database download - Wikipedia
en.wikipedia.org/wiki/Wikipedia:Database_download
What is a data dump in wikitaxi?
- Data dumps, in general, are outputs of data that are used as a backup. But it can also be used to replicate the database. This is essentially what you’ll be doing with WikiTaxi. To start getting your own offline Wikipedia, you’ll be downloading a Wikipedia database file and the WikiTaxi application from the Internet.
How To Download And View Wikipedia Offline - Investintech.com
www.investintech.com/resources/blog/archives/4198-view-download-wikipedia-offline.html
What is a Wikimedia enterprise HTML dump?
- A copy of all pages from all Wikipedia wikis, in HTML form. These are currently not running, but Wikimedia Enterprise HTML dumps are provided for some wikis. Available for some Wikipedia editions. A complete copy of selected Wikimedia wikis which no longer exist and so which are no longer available via the main database backup dump page.
Wikimedia Downloads
dumps.wikimedia.org/
What is a wiki dump?
- Dumps are produced for a specific set of namespaces and wikis, and then made available for public download. Each dump output file consists of a tar.gz archive which, when uncompressed and untarred, contains one file, with a single line per article, in json format. This is currently an experimental service.
Wikipedia:Database download - Wikipedia
en.wikipedia.org/wiki/Wikipedia:Database_download
Wikipedia:Database download - Wikipedia
en.wikipedia.org › wiki › Wikipedia:Database_download
Browsing a wiki page is just like browsing a Wiki site, but the content is fetched and converted from a local dump file on request from the browser. XOWA. XOWA is a free, open-source application that helps download Wikipedia to a computer. Access all of Wikipedia offline, without an internet connection!
How do I download and work with wikipedia data dumps?
stackoverflow.com › questions › 63035431
Jul 22, 2020 · You can either download the dumps from https://dumps.wikimedia.org/enwiki/ and parse them locally, or you can also contact the API.
How to Download Wikipedia for Offline, At-Your-Fingertips Reading
www.howtogeek.com › 260023 › how-to-download
Sep 29, 2022 · Have you ever wished you could download Wikipedia in its entirety, and have a copy of it for yourself? There are a handful of ways to do just that --- all you need is a third-party program and about 150 gigabytes of storage.
Wikimedia Downloads
dumps.wikimedia.org
DVD distributions. Available for some Wikipedia editions. Backup dumps of wikis which no longer exist. A complete copy of selected Wikimedia wikis which no longer exist and so which are no longer available via the main database backup dump page. This includes, in particular, the Sept. 11 wiki. Analytics data files.
How To Download And View Wikipedia Offline - Investintech.com
www.investintech.com › resources › blog
How To Download Your Own Wikipedia. To start getting your own offline Wikipedia, you’ll be downloading a Wikipedia database file and the WikiTaxi application from the Internet. The application has the offline Wikipedia viewer and importer you need.
GitHub - daveshap/PlainTextWikipedia: Convert Wikipedia ...
github.com › daveshap › PlainTextWikipedia
Convert Wikipedia database dumps into plain text files (JSON). This can parse literally all of Wikipedia with pretty high fidelity. There's a copy available on Kaggle Datasets.
How to Download a Complete Offline Version of Wikipedia That ...
digiwonk.gadgethacks.com › how-to › download
Apr 23, 2013 · Having access to nearly all of Wikipedia's articles offline. There are a couple of ways you could do this, and I'll be showing you how to do it with ZIM files via Kiwix (Mac, Windows, Linux), by downloading XML files directly from Wikipedia, and reading XML files with WikiTaxi (Windows).