This is the talk page for discussing improvements to the Database download page.
Archives: 1, 2, 3. Auto-archiving period: 365 days.
This page has archives. Sections older than 365 days may be automatically archived by Lowercase sigmabot III when more than 4 sections are present.
@Philoserf: I note that your recent edits removed many of the inline external links. This makes the page much harder to use: statements such as "If it doesn't work, see the forums" and "The SQL file used to initialize a MediaWiki database can be found here" aren't useful without their external link. This is technical documentation aimed at technical users, not an encyclopedia article, so I'm not convinced that applying the MOS strictly is a good idea. Are you able to redo your edit without losing the external links? -- John of Reading (talk) 08:04, 8 February 2023 (UTC)
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
Unfortunately the link to the wiki-as-ebook store is no longer available, so the link needs to be removed. The affected entry reads: "E-book: The wiki-as-ebook store provides ebooks created from a large set of Wikipedia articles with grayscale images for e-book readers (2013)." Tibor Brink (talk) 15:44, 1 March 2023 (UTC)
Update: this is still listed as an option in the "Offline Wikipedia Reader" list at the top, and that should be removed as well. — Preceding unsigned comment added by Dfhci (talk • contribs) 14:09, 17 March 2024 (UTC)
This edit request has been answered. Set the |answered= or |ans= parameter to no to reactivate your request.
This page states that pages-meta-current.xml.bz2 is over 19 GB compressed, which, although technically true, is highly misleading: the file is now 36.6 GB in size when compressed. 73.249.220.113 (talk) 02:31, 15 October 2023 (UTC)
The "How to use multistream?" shows
" For multistream, you can get an index file, pages-articles-multistream-index.txt.bz2. The first field of this index is the number of bytes to seek into the compressed archive pages-articles-multistream.xml.bz2, the second is the article ID, the third the article title.
Cut a small part out of the archive with dd using the byte offset as found in the index. You could then either bzip2 decompress it or use bzip2recover, and search the first file for the article ID.
See https://docs.python.org/3/library/bz2.html#bz2.BZ2Decompressor for info about such multistream files and about how to decompress them with python; see also https://gerrit.wikimedia.org/r/plugins/gitiles/operations/dumps/+/ariel/toys/bz2multistream/README.txt and related files for an old working toy. "
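A minimal sketch of the index lookup, assuming the file names quoted above; the title, offsets, and article ID below are hypothetical, for illustration only:
bzcat pages-articles-multistream-index.txt.bz2 | grep ':Linux$'   # look up a wanted title
# each index line is offset:article-ID:title, e.g. 598453:6097297:Linux (hypothetical)
# the next distinct offset appearing in the index marks where this bz2 stream ends
The first number is the byte offset to hand to dd, as sketched further down.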
I have the index and the multistream, and I can make a live USB flash drive with https://trisquel.info/en/wiki/how-create-liveusb
lsblk                                     # identify the USB device node
sudo umount /dev/sdX*                     # unmount any mounted partitions first
sudo dd if=/path/to/image.iso of=/dev/sdX bs=8M && sync   # write the image, then flush to disk
but I do not know how to use dd well enough to "Cut a small part out of the archive with dd using the byte offset as found in the index" and then "either bzip2 decompress it or use bzip2recover, and search the first file for the article ID."
Is there any video or more information on Wikipedia about how to do this, so I can look at Wikipedia pages, or at least the text off-line?
Thank you for your time.
Other Cody (talk) 22:46, 4 December 2023 (UTC)