This is the Database scanner subsection of the user manual for AutoWikiBrowser.
Select the namespaces you want to search within. If none are selected, the search will include all available namespaces. Please note that your dump file might not contain data for every namespace available on your wiki.
Lets you search for pages with edit restrictions (semi-protected, fully protected, etc.).
Provides links to relevant help pages about database dumps.
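The speed of the database scanner depends mainly on two factors of the system it runs on: CPU performance and the speed at which the dump file can be read from disk.
Example performance: on an Intel Core i5 520M mobile CPU, the scanner runs at maximum CPU usage with a sequential disk read rate of about 30 MB/s.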
So, with a reasonable 2010-era or later CPU, AWB reads the database XML dump file at around 30 MB/s and is CPU-limited. If the dump file is read from networked storage, scan performance is reduced whenever the network transfer speed is below this rate. When the dump file is read from a local disk, modern mechanical hard disks normally provide sequential read speeds well above 30 MB/s, so the scan speed remains CPU-limited.
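The reasoning above amounts to taking the slower of two rates. The following Python sketch illustrates that arithmetic only; it is not part of AWB, the function names are hypothetical, and the 60 GB dump size and 10 MB/s network speed are assumed figures chosen for the example.

```python
def effective_scan_rate(cpu_limit_mb_s: float, transfer_mb_s: float) -> float:
    """Return the overall scan rate: the slower stage sets the pace."""
    return min(cpu_limit_mb_s, transfer_mb_s)


def estimated_scan_minutes(dump_size_gb: float,
                           cpu_limit_mb_s: float,
                           transfer_mb_s: float) -> float:
    """Estimate scan time in minutes, taking 1 GB as 1000 MB."""
    rate = effective_scan_rate(cpu_limit_mb_s, transfer_mb_s)
    return dump_size_gb * 1000 / rate / 60


if __name__ == "__main__":
    # Assumed figures: a 60 GB dump and a CPU that caps the scan at 30 MB/s.
    local_disk = estimated_scan_minutes(60, cpu_limit_mb_s=30, transfer_mb_s=120)
    slow_network = estimated_scan_minutes(60, cpu_limit_mb_s=30, transfer_mb_s=10)
    print(f"local disk:   {local_disk:.1f} minutes (CPU-limited)")
    print(f"slow network: {slow_network:.1f} minutes (transfer-limited)")
```

With these assumed numbers, a local disk faster than 30 MB/s does not shorten the scan, while a 10 MB/s network link makes it three times longer.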
The database scanner is multi-threaded. The main thread reads the database XML file from disk, and one or more additional threads search the articles against the user's search criteria. The total number of threads equals the number of CPU cores (e.g. a quad-core CPU without hyperthreading gives 1 main and 3 secondary threads). If the secondary threads fall too far behind, the main thread pauses XML reading and contributes to article searching. This happens when searching an article is slower than reading it from the XML file, which is typically the case. It does occur on the example Core i5 520M, so database scanner performance there is limited by how fast all the threads together can search the articles, that is, by the multi-threaded performance of the CPU.
A CPU with more cores, better performance from each core, or both would improve database scanner performance.
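The threading arrangement described above can be sketched as a reader that hands pages to worker threads and joins in the searching itself when the backlog grows. This is a minimal Python illustration of that pattern, not AWB's actual implementation: the `scan` function, the `max_backlog` threshold and the in-memory `pages` list (standing in for the XML dump) are all assumptions made for the example. Note also that CPython threads do not run regex searches in parallel, so this shows the structure rather than the speed-up.

```python
import os
import queue
import re
import threading


def scan(pages, pattern, max_backlog=8, workers=None):
    """Return the sorted titles of pages whose text matches pattern."""
    if workers is None:
        # One thread per core in total: the main thread plus the rest.
        workers = max((os.cpu_count() or 2) - 1, 1)
    backlog = queue.Queue()
    matches = []
    lock = threading.Lock()
    regex = re.compile(pattern)

    def search(page):
        title, text = page
        if regex.search(text):
            with lock:
                matches.append(title)

    def worker():
        while True:
            page = backlog.get()
            if page is None:  # sentinel: no more pages to search
                break
            search(page)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()

    # The main thread "reads" pages one at a time (here, from a list).
    for page in pages:
        if backlog.qsize() >= max_backlog:
            # Secondary threads are too far behind: the main thread
            # stops queueing and searches this page itself.
            search(page)
        else:
            backlog.put(page)

    for _ in threads:
        backlog.put(None)
    for thread in threads:
        thread.join()
    return sorted(matches)


if __name__ == "__main__":
    sample = [("A", "foo bar"), ("B", "baz"), ("C", "foo")]
    print(scan(sample, r"foo"))
```

Setting `max_backlog=0` forces the main thread to do all the searching, which is a convenient way to check that both paths return the same matches.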