From Wikipedia, the free encyclopedia
The Signpost
Single-page Edition
WP:POST/1
8 October 2014

 

2014-10-08

US opposition research firm blocked; Australian bushfires

GOP opposition research firm blocked for editing Wikipedia

America Rising is a Republican opposition research organization founded in 2013 by Matt Rhoades, campaign manager for US Presidential candidate Mitt Romney, and Joe Pounder and Tim Miller, media strategists for the Republican National Committee. It is split into two related organizations: a limited liability company (LLC) that coordinates with Republican candidates, and a political action committee (PAC) dedicated to spreading negative stories about Democratic candidates online and in the news media. Its webpage brags of "cataloguing every Democrat [sic] utterance", and it has become known for its "trackers", who follow candidates and record video of their public comments in hopes of preserving evidence of an embarrassing misstep.

BuzzFeed reported (September 28) on three Wikipedia accounts that self-identified as belonging to America Rising staffers. Two of them, including one belonging to Miller, made only non-political edits. The third, User:Sprinkler Court, made 34 edits, openly identifying his or her conflict of interest in the edit summaries of the major ones. Most of these edits inserted, or advocated the insertion of, material unflattering or potentially damaging to ten current Democratic candidates for the United States Senate. The 2014 midterm elections may result in Republicans gaining majority control over the Senate.

Candidate | Senate race | Content of edits
Mark Begich | Alaska | contents of a campaign advertisement
Bruce Braley | Iowa | comments about Republican Senator Chuck Grassley, Veterans' Affairs Committee attendance, dispute over a neighbor's therapeutic chickens
Mary Landrieu | Louisiana | claim that Landrieu does not own "developed property" in Louisiana, removed mention of the loss of her home due to Hurricane Katrina, internal investigation of flight billing
Michelle Nunn | Georgia | contents of a strategy memo leaked to the National Review, use of George H. W. Bush in campaign materials, Bush's endorsement of opponent David Perdue
Gary Peters | Michigan | employment at Central Michigan University, ownership of stock in French energy company Total SA
Mark Pryor | Arkansas | comments about opponent Tom Cotton
Jeanne Shaheen | New Hampshire | work for presidential campaigns of Jimmy Carter and Gary Hart, failure to release tax returns
Natalie Tennant | West Virginia | supposed errors by the Secretary of State of West Virginia's office, the presence of a federal inmate as a candidate on the WV presidential ballot
Mark Udall | Colorado | campaign events with President Barack Obama
John Walsh | Montana | report blocking his promotion to Brigadier General

Following the publication of the BuzzFeed article, Miller posted on Twitter: "America Rising does wiki editing the right, honest way. Fact based edits. 100% transparent about our interest."

On October 5, Sprinkler Court was blocked for disruptive editing by User:Jehochman, who wrote on the account's talk page:

BuzzFeed reported (October 6) that Miller vowed to challenge the blocking through unspecified means:

The blocking was also reported on by The Hill. (G)

Australian minister under fire for bushfire comment

Bushfire in Lithgow, New South Wales, October 2013

In October 2013, in the wake of the devastating 2013 New South Wales bushfires, Christiana Figueres, Executive Secretary of the United Nations Framework Convention on Climate Change, warned of the increasing frequency and intensity of heat waves in Australia due to climate change. Prime Minister of Australia Tony Abbott, who famously called climate change "absolute crap" in 2009, dismissed the link between bushfires and climate change and said Figueres was "talking through her hat". In an interview that October with BBC Newshour, Environment Minister Greg Hunt defended Abbott's comments, saying

At the time, Hunt's comments, especially his invocation of Wikipedia, were widely criticized (see previous Signpost coverage).

A year later, a Sydney Morning Herald report (October 7) that circulated widely in Australian media revealed that Hunt had been briefed by the Bureau of Meteorology on climate change and bushfires three weeks before the Newshour interview in which he indicated he had consulted Wikipedia on the topic. Contrary to his assertion in the interview that the Bureau was careful not to link the two, the confidential briefings indicate that the Bureau's director informed Hunt: "A number of more recent studies are drawing probabilistic links between more extreme seasonal heat records and climate change, including the Australian summer of 2012-13." (G)

In brief

Model of Wikipedia sculpture for Slubice, Poland
  • EFF organizes a Wikipedia Edit-a-thon for the Zone 9 Bloggers: April Glaser writes (October 3) at the website of the Electronic Frontier Foundation about the organization's edit-a-thon to improve Wikipedia coverage of the Zone 9 bloggers and related topics. The Zone 9 bloggers are an Ethiopian bloggers group that, according to their Wikipedia article, are "a collective focusing on human rights, good governance, education, social justice, corruption and non-violence social transformation." Glaser noted that last July, the Ethiopian government arrested six Zone 9 bloggers, as well as three other journalists. The edit-a-thon took place as a Wikipedia meetup on Sunday, August 24. As for the results, Glaser reported:
The article was also republished by Geneva-based Cyber Ethiopia. (P)
Brooke Magnanti at Leeds Skeptics
  • Brooke Magnanti accused of plagiarising from Wikipedia: Milo Yiannopoulos wrote on Breitbart.com (September 30) about Jeremy Duns' compilation of alleged plagiarisms by Brooke Magnanti. Dr. Magnanti, also known under the pen name Belle de Jour, apparently paraphrased material from Wikipedia but failed to attribute it as her source. Yiannopoulos includes passages from Wikipedia and other sources alongside the corresponding portions of Magnanti's columns, which contain what appear to be paraphrases with no acknowledgement of the source. Readers of the 2014-10-01 Signpost article "Let's get serious about plagiarism" will know that while Magnanti did not copy article passages directly, and while it is arguable whether she paraphrased too closely, "failing to acknowledge the source of quotations and borrowed ideas" would definitely make her a plagiarist. (P)
  • Unreliable sources: Craccum, the student magazine of the University of Auckland, writes about its attempts at "Vandalizing Wikipedia" (September 29). Conclusion: it's hard. (G)



2014-10-08

From a wordless novel to a coat of arms via New York City

This Signpost "Featured content" report covers material promoted from 28 September through 4 October, 2014. Anything in quotation marks is taken from the respective articles and lists; see their page histories for attribution.

Featured articles

Two featured articles were promoted this week.

Margaret Bondfield (1919)

Featured lists

Four featured lists were promoted this week.

Colin Firth and Helena Bonham Carter in a scene from The King's Speech

Featured pictures

Sixty-two featured pictures were promoted this week.

The coat of arms of the state of Illinois

Featured portals

The lead image for Portal:Children's literature

Two featured portals were promoted this week.





2014-10-08

Panic and denial

The first case of the Ebola virus on US shores sent people into a tizzy, rushing to their keyboards to learn what they could. Although Ebola is actually quite difficult to transmit, the news media readily drummed up the apocalypse from this single case. Overcome with anxiety, Wikipedia users returned to their safe havens: TV, video games, novels and celebrities.

For the full top 25 list, see WP:TOP25. See this section for an explanation of any exclusions.

As prepared by Serendipodous, for the week of 28 September – 4 October, 2014, the 10 most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:

Rank | Article | Class | Views | Notes
1 | Ebola virus disease | B-class | 1,153,294 | This week, two weeks after an exasperated World Health Organization declared that the 2014 West Africa Ebola outbreak, already the largest in history by far, was spiralling out of control, the first American case was identified in Texas. This naturally made people a bit panicky, even though it is virtually impossible for anyone to spread Ebola before they're symptomatic.
2 | Amal Alamuddin | C-class | 1,129,602 | The female population of the planet turned their furious gaze on this highly accomplished human rights lawyer, determined to ascertain whether she was truly good enough to take the most eligible bachelor on the planet away from them.
3 | 2014 Asian Games | B-class | 577,311 | The 2014 Asian Games, a pan-Asian sporting event held every four years, commenced its 2014 edition in Incheon, South Korea on 19 September; the event will run through 4 October. The 2014 Games have 28 Olympic sports, as well as eight non-Olympic sports including baseball and sepak takraw (kick volleyball). The medal table currently shows China with a runaway lead, followed by South Korea, Japan, and Kazakhstan.
4 | On the Internet, nobody knows you're a dog | Good Article | 571,290 | This prescient 1993 cartoon from The New Yorker, which anticipated the rise of cyberstalking and virtual identity, gradually became an internet meme in its own right, as noted on a Reddit thread this week.
5 | George Clooney | C-class | 527,193 | Sorry, ladies, but the last bachelor star is now off the market, having married Amal Alamuddin (see above).
6 | Gotham (TV series) | C-class | 460,188 | This American TV series is yet another reboot of the Batman franchise, and debuted on 22 September 2014.
7 | Agents of S.H.I.E.L.D. | B-class | 433,869 | Marvel Studios' expansion of its cinematic universe into television returned for its second season on 23 September.
8 | Gone Girl (novel) | B-class | 418,892 | David Fincher's hit movie spurred interest in the novel on which it is based.
9 | Middle-earth: Shadow of Mordor | Start-class | 409,547 | This open world action adventure game set in JRR Tolkien's Middle-earth was the biggest launch ever for a game set in that universe.
10 | Deaths in 2014 | List | 400,447 | The list of deaths in the current year is always a popular article.



2014-10-08

HHVM is the greatest thing since sliced bread

Legoktm is a platform engineer for the Wikimedia Foundation. He wrote this in his volunteer capacity with members of the MediaWiki Core team.

No, seriously, it is.

HHVM stands for "HipHop Virtual Machine". It is an alternative PHP runtime, developed by Facebook and other open source contributors to improve the performance of PHP code. It stemmed from HipHop for PHP, an earlier project at Facebook which compiled PHP into C++ code. Compared to the default PHP runtime, it offers significant speedup for many operations.

In March 2014, a group of MediaWiki developers started working on ensuring that the codebase, along with the various PHP extensions used on Wikimedia servers, was compatible with HHVM. This involved making changes to MediaWiki, filing bugs with the HHVM project, and often also submitting patches for those bugs.

Users will see performance improvements in many places, especially when editing extremely large articles. If you're interested in helping the development team find bugs, or just want your editing experience to be faster, you can enable the "HHVM" beta feature in your preferences.

We caught up with longtime MediaWiki developer and lead platform architect Tim Starling and asked him a few questions about the HHVM migration:

What is HHVM?

HHVM is a new implementation of the PHP language, written in C++. It has a rewritten runtime — that is to say, most of the functions exposed to PHP code have new implementations. It has a JIT compiler which can translate small snippets of PHP code to machine code.

What have the performance gains of HHVM been so far? Are they expected to increase over time?

At the moment, it is faster by roughly a factor of two. This is somewhat disappointing since the old HipHop compiler gave us a speedup of 3–5 depending on workload, although at the cost of an hour-long compile time. The performance gain is expected to increase over time, due to:
  • Deployment of RepoAuthoritative mode. This involves analysing the PHP code for a few minutes prior to deployment, in order to generate more efficient HHVM bytecode. It is said to give a speedup of about 30%.
  • Optimisation of HHVM for MediaWiki's workload. Brett Simmers of Facebook has offered to spend some time on this after we have fully deployed HHVM.
  • Profiling and optimisation of MediaWiki running under HHVM. We expect big gains here, since not much profiling work has been done on MediaWiki while Ori and I have been working on HHVM. Even the Zend performance of MediaWiki is sub-optimal at present.
  • Ongoing performance work on HHVM by Facebook. Facebook have a performance team which constantly improves the performance of HHVM.
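
As a rough sketch of how the quoted figures might combine: the interview gives only the individual numbers (a factor of two today, about 30% from RepoAuthoritative mode, and 3–5 for the old HipHop compiler). Treating the 30% as multiplying on top of the current gain is an assumption, as is the hypothetical 10-second operation used for scale.

```python
# Illustrative arithmetic only. ASSUMPTION: the RepoAuthoritative gain
# multiplies on top of the current speedup; the interview does not say so.
CURRENT_SPEEDUP = 2.0           # "roughly a factor of two"
REPO_AUTH_GAIN = 0.30           # "about 30%" quoted for RepoAuthoritative mode
OLD_HIPHOP_RANGE = (3.0, 5.0)   # speedup range quoted for the old HipHop compiler


def combined_speedup(base: float, extra_gain: float) -> float:
    """Overall speedup if extra_gain (a fraction) applies on top of base."""
    return base * (1.0 + extra_gain)


def time_after(original_seconds: float, speedup: float) -> float:
    """Duration of an operation once it runs `speedup` times faster."""
    return original_seconds / speedup


projected = combined_speedup(CURRENT_SPEEDUP, REPO_AUTH_GAIN)

# The 10-second operation is a hypothetical workload, not a measured one.
print(f"projected speedup: {projected:.1f}x")
print(f"a 10 s operation would take about {time_after(10.0, projected):.2f} s")
print(f"below the old HipHop range: {projected < OLD_HIPHOP_RANGE[0]}")
```

Under that assumption the projected figure still sits below the low end of the old compiler's range, which is consistent with the expectation that the remaining items on the list matter as well.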

What sort of impact can users expect from the deployment of HHVM? What sort of issues might users run into?

We expect to see crashes and other fatal errors, and also more subtle bugs such as incorrect HTML generated by Lua or wikitext. Note that as we are rolling out HHVM, we are also upgrading from Ubuntu 12.04 to Ubuntu 14.04, which means different versions of many system libraries and utilities. When we move the image scalers to Ubuntu 14.04, there may be changes to SVG rendering.

What effort has gone into ensuring that HHVM performs well and is reliable, especially at Wikipedia's scale?

We have tested HHVM's performance by benchmarking, and also under real load by diverting a proportion of Wikipedia's traffic to a single HHVM backend server. We have some assurance from the fact that Facebook has been running HHVM for years in production, and their scale is significantly larger than our scale. Facebook's experience also gives us some confidence as to reliability.
With any open source project, it is difficult to give assurances as to reliability. PHP has not been uniformly reliable for us, and has presented all sorts of challenges for us over the years as we have scaled up. HHVM's architecture is built with much more awareness of the challenges of scaling than PHP was. So we have reason to think that HHVM will prove to be a more stable platform for a busy website in the long term.
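
The interview does not describe how the traffic was diverted. One common way to send a fixed proportion of requests to a test backend is to hash a request identifier into a bucket, sketched below. The function name, the backend labels, and the 5% figure are all hypothetical and are not taken from Wikimedia's configuration.

```python
import hashlib


def pick_backend(request_id: str, hhvm_fraction: float) -> str:
    """Route a request to 'hhvm' or 'zend' deterministically.

    HYPOTHETICAL sketch: the identifier is hashed into a number in [0, 1),
    and requests whose number falls below hhvm_fraction go to the test backend.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "hhvm" if bucket < hhvm_fraction else "zend"


# Example: divert roughly 5% of 10,000 made-up request ids.
sample = [f"request-{i}" for i in range(10_000)]
diverted = sum(pick_backend(r, 0.05) == "hhvm" for r in sample)
print(f"{diverted} of {len(sample)} requests routed to the test backend")
```

Because the choice depends only on the identifier, the same request is always routed the same way, which makes a bug seen under real load easier to reproduce.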

What was the biggest challenge to rolling out HHVM?

Integration of Lua. I don't think anyone has integrated another language with HHVM before, in the same address space. We did it by rewriting HHVM's Zend compatibility layer, allowing our existing Lua extension for Zend to be also compiled against HHVM, with only a few instances of cheating (#ifdef).

What is Hack and do you think it will affect Wikimedia development?

Hack is the new name for the language extensions that Facebook has progressively added to the syntax of PHP over the last four years in HipHop Compiler and HHVM. It also refers to the static type checker that they have recently introduced. Hack allows types to be specified for function return values, and extends the existing support for specified types in function parameters. The net effect is to allow an existing PHP codebase to be progressively migrated to strong typing, with many type checks being done pre-commit instead of at runtime. This is beneficial for developers of large applications, and helps to avoid errors being seen by users at runtime.
For now, we are committed to compatibility with PHP, and so it is difficult for us to make use of Hack's language extensions, except in WMF-specific services and tools. I would love to see a move towards language harmonization by PHP towards Hack – for example, return type hints could easily be added. I'm not sure of the reason for the split, since PHP does not strike me as an especially conservative community.
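
Hack's own syntax is not reproduced here. As a loose analogy, Python's optional type hints, checked by an external tool such as mypy, follow a similar gradual-typing idea: declared parameter and return types let a checker catch mistakes before the code runs, while unannotated code keeps working during migration. All function names below are hypothetical.

```python
from typing import Dict, Optional


def find_title(page_id: int, titles: Dict[int, str]) -> Optional[str]:
    """Annotated: both the parameter types and the return type are declared."""
    return titles.get(page_id)


def title_length(page_id: int, titles: Dict[int, str]) -> int:
    title = find_title(page_id, titles)
    # Without this check, a static checker would reject len(title), because
    # the declared return type says title may be None. The mistake is caught
    # before the code runs rather than by a user at runtime.
    if title is None:
        return 0
    return len(title)


def legacy_lookup(page_id, titles):
    """Unannotated code can coexist with annotated code during a migration."""
    return titles.get(page_id)


print(title_length(1, {1: "Main Page"}))
print(title_length(2, {1: "Main Page"}))
```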

Currently logged-out users have a significantly faster experience than logged-in users. Is it realistic to expect that logged-in users will one day have the same experience as logged-out users? If so, when?

HHVM by itself will not provide performance parity for most users. It will help to reduce parser cache hit times, and for many articles, for users near our main US data centre, the difference between logged-in and logged-out page view times may be imperceptible. But for users outside the US, we will continue to serve logged-out page views from the nearest cache, whereas logged-in page view requests are forwarded to Virginia, which will add 100-300ms due to the speed of light delay. This is not easy to fix.
Parser cache misses could be reduced or eliminated, but page save times are somewhat more difficult, in my opinion, especially if we continue to support pre-edit spam and vandalism detection. Some amount of processing is needed to detect vandalism – is it appropriate to pretend that we have saved the page when such processing is still going on, and to send a message later if the edit is rejected? And if we do that, do we show the updated site to the user while processing is in progress?
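
To give a feel for the speed-of-light delay, the sketch below computes a lower bound on a single round trip over optical fibre. It assumes light in fibre travels at about two thirds of its vacuum speed, and the distances are hypothetical examples, not measured routes to Virginia. Real requests add routing detours, processing time, and often more than one round trip, so observed delays are higher.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBRE_FACTOR = 2 / 3                # ASSUMPTION: typical slowdown in optical fibre


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on one round trip over fibre of the given one-way length."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBRE_FACTOR)
    return 2 * one_way_seconds * 1000


# Hypothetical one-way cable distances to the data centre.
for distance_km in (6_000, 10_000, 16_000):
    print(f"{distance_km:>6} km: at least {min_round_trip_ms(distance_km):.0f} ms per round trip")
```

Even this idealised bound reaches about 100 ms at 10,000 km, so the delay cannot be optimised away in software; it can only be avoided by serving the response from a nearer location.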

After HHVM is fully deployed, what are the next big projects to improve performance?

We are planning to work on editor performance, especially VisualEditor. Also, as previously noted, there will be ongoing profiling and optimisation work which will cumulatively improve performance.


More information is available at mw:HHVM/About, and information about the current development process can be found at mw:HHVM.

P.S.: If you too find HHVM to be awesome, you can leave your thanks to the developers here.


From Wikipedia, the free encyclopedia
The Signpost
Single-page Edition
WP:POST/1
8 October 2014

 

2014-10-08

US opposition research firm blocked; Australian bushfires

GOP opposition research firm blocked for editing Wikipedia

America Rising is a Republican opposition research organization founded in 2013 by Matt Rhoades, campaign manager for US Presidential candidate Mitt Romney, and Joe Pounder and Tim Miller, media strategists for the Republican National Committee. It is split into two related organizations, a limited liability company (LLC) that coordinates with Republican candidates, and a political action committee (PAC) dedicated to spreading negative stories about Democratic candidates online and in the news media. Its webpage brags of "cataloguing every Democrat [sic] utterance" and it has become known for its "trackers" which follow and record video of the public comments of candidates in hopes of preserving evidence of an embarrassing misstep.

BuzzFeed reported (September 28) on three Wikipedia accounts that self-identified as belonging to America Rising staffers. Two of them, including one belonging to Miller, only made non-political edits. The third, User:Sprinkler Court, made 34 edits, openly identifying his or her conflict of interest in the edit summaries of his or her major edits. These edits were mostly to insert or advocate the insertion of material unflattering or potentially damaging to ten current Democratic candidates for the United States Senate. The 2014 midterm elections may result in Republicans gaining majority control over the Senate.

Candidate Senate race Content of edits
Mark Begich Alaska contents of a campaign advertisement
Bruce Braley Iowa comments about Republican Senator Chuck Grassley, Veterans' Affairs Committee attendance, dispute over a neighbor's therapeutic chickens
Mary Landrieu Louisiana claim that Landrieu does not own "developed property" in Louisiana, removed mention of the loss of her home due to Hurricane Katrina, internal investigation of flight billing
Michelle Nunn Georgia contents of a strategy memo leaked to the National Review, use of George H. W. Bush in campaign materials, Bush's endorsement of opponent David Perdue
Gary Peters Michigan employment at Central Michigan University, ownership of stock in French energy company Total SA
Mark Pryor Arkansas comments about opponent Tom Cotton
Jeanne Shaheen New Hampshire work for presidential campaigns of Jimmy Carter and Gary Hart, failure to release tax returns
Natalie Tennant West Virginia supposed errors by the Secretary of State of West Virginia's office, the presence of a federal inmate as a candidate on the WV presidential ballot
Mark Udall Colorado campaign events with President Barack Obama
John Walsh Montana report blocking his promotion to Brigadier General

Following the publication of the Buzzfeed article, Miller posted on Twitter "America Rising does wiki editing the right, honest way. Fact based edits. 100% transparent about our interest."

On October 5, Sprinkler Court was blocked for disruptive editing by User:Jehochman, who wrote on the account's talk page:

BuzzFeed reported (October 6) that Miller vowed to challenge the blocking through unspecified means:

The blocking was also reported on by The Hill. (G)

Australian minister under fire for bushfire comment

Bushfire in Lithgow, New South Wales, October 2013

In October 2013, in the wake of the devastating 2013 New South Wales bushfires, Christiana Figueres, Executive Secretary of the United Nations Framework Convention on Climate Change, warned of the increasing frequency and intensity of heat waves in Australia due to climate change. Prime Minister of Australia Tony Abbott, who famously called climate change "absolute crap" in 2009, dismissed the link between bushfires and climate change and said Figueres was "talking through her hat". In an interview that October with BBC NewsHour, Environment Minister Greg Hunt defended Abbott's comments, saying

At the time, Hunt's comments, especially his invocation of Wikipedia, were widely criticized (see previous Signpost coverage).

A year later, a report from the Sydney Morning Herald (October 7) widely circulating through Australian media reveals that Hunt had been briefed by the Bureau of Meteorology on climate change and bushfires three weeks prior to the Newshour interview where he indicated he had consulted Wikipedia on the topic. Contrary to his assertion in the interview that the Bureau was careful not to link the two, the confidential briefings indicate the Bureau's director informed Hunt "A number of more recent studies are drawing probabilistic links between more extreme seasonal heat records and climate change, including the Australian summer of 2012-13." (G)

In brief

Model of Wikipedia sculpture for Slubice, Poland
Wikipedia sculpture model
EFF Logo
EFF Logo
  • EFF organizes a Wikipedia Edit-a-thon for the Zone 9 Bloggers: April Glasere writes (October 3) at the website of the Electronic Frontier Foundation about their organization of an edit-a-thon to improve Wikipedia coverage of the Zone 9 bloggers and related topics. The Zone 9 bloggers are an Ethiopian bloggers group that, according to their Wikipedia article, are "a collective focusing on human rights, good governance, education, social justice, corruption and non-violence social transformation." Glasere noted that last July, the Ethiopian government arrested six Zone 9 bloggers, as well as three other journalists. The edit-a-thon took place as a Wikipedia meetup on Sunday, August 24. As for the results, Glasare reported:
The article was also republished by Geneva-based Cyber Ethiopia. (P)
Brooke Magnanti at Leeds Skeptics
Brooke Magnanti
  • Brooke Magnanti accused of plagarising from Wikipedia: Milo Yiannopoulos wrote about Jeremy Duns' compilation of alleged plagarisms by Brooke Magnanti on Breitbart.com (September 30). Dr. Magnanti, also known under the pen name of Belle de Jour, apparently paraphrased material from Wikipedia but failed to attribute it as her source. Yiannopoulos includes passages from Wikipedia and other sources along with the corresponding portions from Magnanti's columns containing what appear to be paraphrases with no acknowledgement of the source. Readers of the 2014-10-01 Signpost article " Let's get serious about plagiarism" will know that while Magnanti did not copy article passages directly, and while it is arguable whether she paraphrased too closely, "failing to acknowledge the source of quotations and borrowed ideas" would definitely make her a plagiarist. (P)
  • Unreliable sources: Craccum, the student magazine of the University of Auckland, writes about its attempts at "Vandalizing Wikipedia" (September 29). Conclusion: it's hard. (G)
OCLC logo
OCLC logo


Reader comments

2014-10-08

From a wordless novel to a coat of arms via New York City

This Signpost "Featured content" report covers material promoted from 28 September through 4 October, 2014. Anything in quotation marks is taken from the respective articles and lists; see their page histories for attribution.

Featured articles

Two featured articles were promoted this week.

Margaret Bondfield (1919)

Featured lists

Four featured lists were promoted this week.

Colin Firth and Helena Bonham Carter in a scene from The King's Speech

Featured pictures

Sixty-two featured pictures were promoted this week.

The coat of arms of the state of Illinois

Featured portals

The lead image for Portal:Children's literature

Two featured portals was promoted this week.




Reader comments

2014-10-08

Panic and denial

The first case of the Ebola virus on US shores sent people into a tizzy, rushing to their keyboards to try and learn what they could. Despite Ebola being actually quite difficult to transmit, the news media readily drummed up the apocalypse from this single case. Overcome with anxiety, Wikipedia users returned to their safe havens; TV, video games, novels and celebrities.

For the full top 25 list, see WP:TOP25. See this section for an explanation of any exclusions.

As prepared by Serendipodous, for the week of 28 September – 4 October, 2014, the 25 most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:

Rank Article Class Views Image Notes
1 Ebola virus disease B-class 1,153,294
This week, two weeks after an exasperated World Health Organization declared that the 2014 West Africa Ebola outbreak, already the largest in history by far, was spiralling out of control, the first American case was identified in Texas. This naturally sent people a bit panicky, even though it is virtually impossible for anyone to spread Ebola before they're symptomatic.
2 Amal Alamuddin C-class 1,129,602 The female population of the planet turned their furious gaze on this highly accomplished human rights lawyer, determined to ascertain whether she was truly good enough to take the most eligible bachelor on the planet away from them.
3 2014 Asian Games B-class 577,311
The 2014 Asian Games, a pan-Asian sporting event held every four years, commenced its 2014 edition in Incheon, South Korea on 19 September; the event will run through 4 October. The 2014 Games have 28 Olympic sports, as well as eight non-Olympic sports including baseball and sepak takraw (kick volleyball). The 2014 Asian Games medal table currently shows China at the 2014 Asian Games with a runaway lead, followed by South Korea, Japan, and Kazakhstan.
4 On the Internet, nobody knows you're a dog Good Article 571,290 This prescient 1993 cartoon from The New Yorker, which anticipated the rise of cyberstalking and virtual identity, gradually became an internet meme in its own right, as noted on a Reddit thread this week.
5 George Clooney C-class 527,193
Sorry ladies, but the last bachelor star is now off the market, having married Amal Alamuddin (see above).
6 Gotham (TV series) C-class 460,188
This American TV series is yet another reboot of the Batman franchise, and debuted on 22 September 2014.
7 Agents of S.H.I.E.L.D. B-class 433,869
Marvel Studios' expansion of its cinematic universe into television returned for its second season on 23 September.
8 Gone Girl (novel) B-class 418,892 David Fincher's hit movie spurred interest in the novel on which it is based.
9 Middle-earth: Shadow of Mordor Start-class 409,547 This open world action adventure game set in JRR Tolkien's Middle-earth was the biggest launch ever for a game set in that universe.
10 Deaths in 2014 List 400,447
The list of deaths in the current year is always a popular article.


Reader comments

2014-10-08

HHVM is the greatest thing since sliced bread

Legoktm is a platform engineer for the Wikimedia Foundation. He wrote this in his volunteer capacity with members of the MediaWiki Core team.

No seriously, it is.

HHVM stands for "HipHop Virtual Machine". It is an alternative PHP runtime, developed by Facebook and other open source contributors to improve the performance of PHP code. It stemmed from HipHop for PHP, an earlier project at Facebook which compiled PHP into C++ code. Compared to the default PHP runtime, it offers significant speedup for many operations.

In March 2014, a group of MediaWiki developers started working on ensuring that the codebase, along with the various PHP extensions used on Wikimedia servers, were compatible with HHVM. This involved making changes to MediaWiki, filing bugs with the HHVM project, and often also submitting patches for those bugs.

Users will see performance improvements in many places, especially when editing extremely large articles. If you're interested in helping the development team out with finding bugs, or just want your editing experience to be faster, you can enable the "HHVM" betafeature in your preferences.

We caught up with longtime MediaWiki developer and lead platform architect Tim Starling and asked him a few questions about the HHVM migration:

What is HHVM?

HHVM is a new implementation of the PHP language, written in C++. It has a rewritten runtime — that is to say, most of the functions exposed to PHP code have new implementations. It has a JIT compiler which can translate small snippets of PHP code to machine code.

What have the performance gains of HHVM been so far? Are they expected to increase over time?

At the moment, it is faster by roughly a factor of two. This is somewhat disappointing since the old HipHop compiler gave us a speedup of 3–5 depending on workload, although at the cost of an hour-long compile time. The performance gain is expected to increase over time, due to:
  • Deployment of RepoAuthoritative [1] mode. This involves analysing the PHP code for a few minutes prior to deployment, in order to generate more efficient HHVM bytecode. It is said to give a speedup of about 30%.
  • Optimisation of HHVM for MediaWiki's workload. Brett Simmers of Facebook has offered to spend some time on this after we have fully deployed HHVM.
  • Profiling and optimisation of MediaWiki running under HHVM. We expect big gains here, since not much profiling work has been done on MediaWiki while Ori and I have been working on HHVM. Even the Zend performance of MediaWiki is sub-optimal at present.
  • Ongoing performance work on HHVM by Facebook. Facebook have a performance team which constantly improves the performance of HHVM.

What sort of impact can users expect from the deployment of HHVM? What sort of issues might users run into?

We expect to see crashes and other fatal errors, and also more subtle bugs such as incorrect HTML generated by Lua or wikitext. Note that as we are rolling out HHVM, we are also upgrading from Ubuntu 12.04 to Ubuntu 14.04, which means different versions of many system libraries and utilities. When we move the image scalers to Ubuntu 14.04, there may be changes to SVG rendering.

What effort has gone into ensuring that HHVM performs well and is reliable, especially at Wikipedia's scale?

We have tested HHVM's performance by benchmarking, and also under real load by diverting a proportion of Wikipedia's traffic to a single HHVM backend server. We have some assurance from the fact that Facebook has been running HHVM for years in production, and their scale is significantly larger than our scale. Facebook's experience also gives us some confidence as to reliability.
With any open source project, it is difficult to give assurances as to reliability. PHP has not been uniformly reliable for us, and has presented all sorts of challenges for us over the years as we have scaled up. HHVM's architecture is built with much more awareness of the challenges of scaling than PHP was. So we have reason to think that HHVM will prove to be a more stable platform for a busy website in the long term.

What was the biggest challenge to rolling out HHVM?

Integration of Lua. I don't think anyone has integrated another language with HHVM before, in the same address space. We did it by rewriting HHVM's Zend compatibility layer, allowing our existing Lua extension for Zend to also be compiled against HHVM, with only a few instances of cheating (#ifdef).

What is Hack and do you think it will affect Wikimedia development?

Hack is the new name for the language extensions that Facebook has progressively added to the syntax of PHP over the last four years in HipHop Compiler and HHVM. It also refers to the static type checker that they have recently introduced. Hack allows types to be specified for function return values, and extends the existing support for specified types in function parameters. The net effect is to allow an existing PHP codebase to be progressively migrated to strong typing, with many type checks being done pre-commit instead of at runtime. This is beneficial for developers of large applications, and helps to avoid errors being seen by users at runtime.
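
The idea of gradual typing with pre-commit checks can be sketched by analogy in Python, whose optional type hints work in a broadly similar way. This is not Hack code and says nothing about Hack's actual syntax; the function names are hypothetical, and mypy is named only as one example of a static checker.

```python
from typing import Dict, Optional


def find_user_id(name: str, users: Dict[str, int]) -> Optional[int]:
    """Return the user's id, or None if the name is unknown."""
    return users.get(name)


def format_id(user_id: int) -> str:
    """Format an id for display; the annotation says None is not accepted."""
    return f"user-{user_id}"


users = {"alice": 1}
uid = find_user_id("alice", users)

# Calling format_id(uid) directly would be reported by a static checker
# such as mypy, because uid may be None. The explicit check below narrows
# the type, so the problem is caught before the code ever runs.
if uid is not None:
    print(format_id(uid))
```

Unannotated functions keep working alongside annotated ones, which is what lets an existing codebase be migrated one function at a time.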
For now, we are committed to compatibility with PHP, and so it is difficult for us to make use of Hack's language extensions, except in WMF-specific services and tools. I would love to see PHP move towards harmonization with Hack; for example, return type hints could easily be added. I'm not sure of the reason for the split, since PHP does not strike me as an especially conservative community.

Currently logged-out users have a significantly faster experience than logged-in users. Is it realistic to expect that logged-in users will one day have the same experience as logged-out users? If so, when?

HHVM by itself will not provide performance parity for most users. It will help to reduce parser cache hit times, and for many articles, for users near our main US data centre, the difference between logged-in and logged-out page view times may be imperceptible. But for users outside the US, we will continue to serve logged-out page views from the nearest cache, whereas logged-in page view requests are forwarded to Virginia, which will add 100–300 ms due to the speed-of-light delay. This is not easy to fix.
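
A back-of-the-envelope calculation shows why this delay cannot be engineered away. The sketch below assumes that signals in optical fibre travel at about two thirds of the speed of light in a vacuum, and uses round-number distances chosen purely for illustration, not measured network routes.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in a vacuum
FIBRE_FACTOR = 2 / 3               # assumed propagation speed in fibre, relative to c


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over fibre for a one-way distance."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBRE_FACTOR)
    return 2 * one_way_s * 1000


# Illustrative one-way distances to the data centre, in kilometres.
for distance_km in (6_000, 12_000, 16_000):
    print(f"{distance_km:>6} km: at least {min_round_trip_ms(distance_km):.0f} ms per round trip")
```

These figures are lower bounds for a single round trip. Real routes are longer than the straight-line distance, and a page view can involve more than one round trip, so the totals are plausibly in the range quoted above.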
Parser cache misses could be reduced or eliminated, but page save times are somewhat more difficult, in my opinion, especially if we continue to support pre-edit spam and vandalism detection. Some amount of processing is needed to detect vandalism – is it appropriate to pretend that we have saved the page when such processing is still going on, and to send a message later if the edit is rejected? And if we do that, do we show the updated site to the user while processing is in progress?

After HHVM is fully deployed, what are the next big projects to improve performance?

We are planning to work on editor performance, especially VisualEditor. Also, as previously noted, there will be ongoing profiling and optimisation work which will cumulatively improve performance.


More information is available at mw:HHVM/About, and information about the current development process can be found at mw:HHVM.

P.S.: If you too find HHVM to be awesome, you can leave your thanks to the developers here.

