WIKIPEDIA TALK:AUTOCONFIRMED ARTICLE CREATION TRIAL POST-TRIAL RESEARCH REPORT

Our suggestions

Thanks for putting together this report. Regarding "Our suggestions..." -- whose suggestions? As this is in projectspace, it's odd to see a page like this and have it express the suggestions of a particular group of people in projectspace voice, if that makes sense. — Rhododendrites ^talk \\ 00:37, 14 March 2018 (UTC) reply

Rhododendrites: Yes, thanks for pointing that out; silly of us to post without any byline. :) I put a line at the top -- not very elegant, but we'll figure out a better place for it. -- DannyH (WMF) ( talk) 14:09, 14 March 2018 (UTC) reply

It's great to see this report. But the headline number seems to be: 1.2%

Under 2% of drafts being published is about as far from wiki as can be. Especially when you consider informal policies like the idea that everything in the draft namespace can be deleted if it doesn't eventually make it to the main space.

A more fundamental problem is that we have only a single binary option for keeping or deleting articles (with tiny variations of highlighting good or featured articles; though most readers won't notice at all). We would be much better served by a dynamic spectrum of [highlighted] to [hidden] articles, with deletion not even an option unless articles are harmful to others or illegal. Authors should be able to always see their own articles, even if they've been hidden from view for most readers, whenever those authors return to work on the articles again: be it days or years later.

A reversible "hiding" action, by virtue of being reversible and not so painful/confusing to the authors working on an article, would also require less consensus and less decision-making overhead.

Enabling a "visibility" parameter that could be varied by editor input is the sort of fundamental improvements to MediaWiki that I'd love to see. Experimenting with shifting around steps in the current heavy workflow is extremely interesting, but may not have the same impact over time.

Finally, if results were evident within 2 months, is 6 months the right length of time for such studies or can we do more of them more quickly? – SJ + 01:52, 14 March 2018 (UTC) reply

I think the inflated standards for Drafts to be "good enough" to publish is a problem, and also the lack of collaboration in Drafts space. We end up deleting a lot of drafts just because they don't have enough citations, even if the basic article is solid and has potential to improve. I hope that both the community and the WMF will look into these issues over the coming year. The hiding idea is interesting, although I wonder if it would just lead to never-ending debates about the visibility levels of all our articles. Regarding the length of ACTRIAL, we (Community Tech) suggested 2 months at the outset, but the community insisted on 6 months since that's what the RfC said. It's very rare on Wikipedia that you actually need 6 months to detect a significant effect from a change. Usually a month or two is enough. Kaldari ( talk) 05:16, 14 March 2018 (UTC) reply

Length of Trial and 'shift from NPP to AfC': I agree that ACTRIAL as an experiment may have been OK to run for only two months to get the data we needed, but the 6 month length did give New Page Patrol a chance to catch up on a backlog of unreviewed indexed articles that had persisted for well over a year, and for that I am extremely grateful. Given the result here, further studies being shorter intervals seem like they would work just fine. I would like to point out that many, perhaps most, AfC reviewers are also New Page Patrollers, so it isn't always a zero sum game when it comes to backlogs and workload shifting from one to the other. I suspect that NPP saw somewhat lower reviewer activity during ACTRIAL (even in spite of ~100 new recruits during the drive) at least in part by a shift of some NPPers to AfC reviewing instead.

AfC publication rate: I had no idea that the publication rate of AfC was so low (1.2%!!!). I'd like to see more research into this. Is it driven primarily by users giving up on a submission after being told by the first reviewer point blank that it is not a suitable topic? If so, that might be a positive thing: perhaps being told once by a knowledgeable user that your topic isn't notable is better than having your article nuked from namespace, I don't know. In any case it does seem to indicate that the vast majority of AfC's time is being spent telling people off for submitting proposals that are not of good quality. This report is correct when it surmises that "article creation isn't easy"; in my experience at editathons and interacting with new users at NPP and AfC, new editors submitting articles need to be either very dedicated to push through the steep learning curve, or else coached very carefully by more experienced editors.

My largest concern: I fear that some of AfC's low publication rate might be driven by topics that are notable (if you searched deeply and found the sources) but that AfC reviewers are declining based on a lack of sources currently in the article, i.e. not performing WP:BEFORE. Identifying whether this is the case or not should be a top-level concern and would indicate a major need for AfC reform. In particular, the AfC reviewing instructions do not explicitly tell reviewers to make a search to find sources themselves prior to declining on notability reasons, which could potentially lead to a lot of notable topics being declined as 'not enough sources to demonstrate notability' based solely on the submitter being a complete noob at finding appropriate sources (safe to assume based on personal experience at editathons that this is the case for nearly all new editors). At NPP on the other hand, we have to perform WP:BEFORE, meaning that notable topics rarely get deleted (exceptions include blank/nocontext articles, BLPPRODs, Copyvios, and A7 deletions).

I think I might have to do some random reviewing at AfC; review a few hundred articles and see how many would be declined based on lack of sources currently in the article (per current AfC reviewing instructions), but which are revealed to be notable if the reviewer performs detailed searches before declining. If it turns out to be a lot, then we need serious AfC reform. — Insertcleverphrasehere ^{(
or here)} 06:14, 14 March 2018 (UTC) reply

I think a large part of the low acceptance rate at AfC is that if the reviewer accepts a draft, they are de facto "responsible" for its outcome. It's a lot of work on a topic you may not know much about, and if a handful of them get sent to AfD (or worse CSD) the community takes it out on the reviewer. That's not wrong, but for better or worse this is a place where reputation matters, and I imagine folks are loathe to put it on the line for a page they don't care about. ~ Amory ( u • t • c) 12:29, 14 March 2018 (UTC) reply

Lack of resolve to make a decision is a problem we often have at NPP on articles that are not particularly clear one way or the other, which tend to languish a bit in the backlog, though the new-ish flowchart has definitely helped alleviate that problem somewhat I think. I have begun working through a randomised sample of articles from the AfC pending submissions list (every 10th article in the list), I'll let you know the results, but so far they are not looking good. Of the first 10, I assessed 6 as notable topics, and 2 additionally as 'borderline'. I am finding it hard to believe that figure of 1.2% after seeing these preliminary results. Could someone give me a bit more info on how that figure was landed upon? — Insertcleverphrasehere ^{(
or here)} 13:54, 14 March 2018 (UTC) reply

1.2% is for all of Drafts, not just articles submitted to AfC. FWIW, I have definitely seen notable topics rejected from AfC (due to inadequate sources), but I'm not sure how bad the problem really is. Kaldari ( talk) 16:21, 14 March 2018 (UTC) reply

@ Kaldari can you clarify; when you say all drafts, are you saying all 'draftspace' drafts, or does this number also contain unserspace drafts? Is there a way of collecting data on the % of articles that are promoted from the total pool of drafts that are or have been at one point submitted through AfC? That would be the key number we are looking for. I pulled up 50 pages from Special:Random/Draft, and of those almost exactly half (26) had been submitted to AfC at some point, so at the very lease the 1.2% figure is out by at least a factor of two when it comes to AfC submitted drafts (AfC can't reasonably be expected to be responsible for anything that wasn't submitted to them, so ones that haven't been submitted to AfC shouldn't be considered as part of the numbers). I will also note that this also doesn't take account for articles that are created and developed in draft space, but then never moved to main space because the creator did a copy/paste pagemove to mainspace (for example see Draft:Cyclone David/Friederike and Cyclone David/Friederike). — Insertcleverphrasehere ^{(
or here)} 19:21, 14 March 2018 (UTC) reply

@ Insertcleverphrasehere: It's 1.2% of pages in the Drafts namespace. Kaldari ( talk) 19:47, 14 March 2018 (UTC) reply

Thanks for clarifying. Given the proportion of about half of draftspace creations hitting AfC at some point, that means that the true percentage of AfC drafts moved to mainspace is up somewhere around 2.4% or higher (higher due to copy/paste moves). — Insertcleverphrasehere ^{(
or here)} 19:55, 14 March 2018 (UTC) reply

@ Insertcleverphrasehere: I thought I'd stop by after checking our numbers to make sure they're good. First, a methodological clarification: we only looked at pages created in the Draft namespace, because going through all revisions of the User namespace to identify AfC submissions were outside our scope. Kaldari's correct that it's 1.2% of all Drafts we had in our dataset. For our analysis of AfC, I created a dataset of drafts created and submitted after July 1, 2014 and before December 1, 2017. That dataset contains 64,353 pages submitted to AfC, and 520 (0.8%) were moved to the Main namespace. Regards, Nettrom ( talk) 21:48, 14 March 2018 (UTC) reply

Most userspace submissions that are not test edits get moved to Draft. So ignoring userspace is fine. I think we can see how many accepts there have been somewhere but 520 is clearly too low for 3.5 years only about 0.4 per day! That can't be right. Legacypac ( talk) 22:07, 14 March 2018 (UTC) reply

@ Legacypac: A close reading of Nettrom's statement tells us that they captured only those drafts that were created in Draft space. Perhaps Nettrom's statement was imprecise and they really did capture User space drafts that later got moved to Draft space. But if not, do we have any feel for how many (i.e., what proportion) of AfC submissions began life in User space? NewYorkActuary ( talk) 23:57, 14 March 2018 (UTC) reply

Any idea why those numbers are so different to the observed current stats of ~10% accepted? Is it possible that AfC is far less picky now than it was in the past? Is it possible to crunch the numbers for the past 6 months and see what kind of percentage we are talking about? — Insertcleverphrasehere ^{(
or here)} 21:53, 14 March 2018 (UTC) reply

@ Nettrom. I'm confused. Is your data set small random sample of the total or is it supposed to be the total? Looking at the page history of Wikipedia:Articles_for_creation/recent, which is edited once every time an article is accepted, the page stats indicate that the page has been edited 50,000 times since 2010, indicating at least that number of accepted AfC submissions through the AFCH script. Category:Accepted AfC submissions has 80,000 pages in it, indicating again that the numbers quoted above are off by at around 2 orders of magnitude from the total numbers. — Insertcleverphrasehere ^{(
or here)} 22:13, 14 March 2018 (UTC) reply

Authors should be able to always see their own articles, even if they've been hidden from view for most readers, whenever those authors return to work on the articles again: be it days or years later will simply never happen, and that is from the WMF. Access to delete pages requires RfA or an RfA equivalent process. TonyBallioni ( talk) 18:05, 14 March 2018 (UTC) reply
- He specifically mentions introducing more states than just published for all and (all revisions) deleted for all-but-admins. The WMF cannot have an opinion on an additional state that not yet exists. I find it an interesting direction of thought. It's akin to Pending Changes, but then for article existence. The engineering effort would be substantial however. Should be interesting to work out such an idea. — TheDJ ( talk • contribs) 19:06, 14 March 2018 (UTC) reply

We essentially already have this at NPP with the index system. Unreviewed articles that are less than 90 days old are not indexed for search engines, making them difficult to find (they won't show up in google until a NPR right holder marks them as reviewed or patrolled. You can still get to the article via in-wiki links or through the Wikipedia search bar however. — Insertcleverphrasehere ^{(
or here)} 19:25, 14 March 2018 (UTC) reply

This is basically the same thing as allowing non-admins who reach X edits or have X bit access to deleted pages. The WMF will never waive the legal protection that comes with restricting access to “hidden” content. There would need to be an RfA equivalent process even for this. The issue isn’t with “deleted” vs “hidden” it’s with publicly viewable vs non-public. Add to that the fact that even if the WMF somehow dropped its longstanding legal concern with allowing access to non-public pages and revisions, the en.wiki community would be very unlikely to accept it beyond what we already have (i.e. a non-indexable and difficult to search draft space.) 23:20, 14 March 2018 (UTC)

As far as I can see, it's not the same at all. a) The WMF will never waive the legal protection...the issue is with publicly viewable vs non-public Untrue. There's no legal or other reason not to have information hidden from default view for [some or all] readers. cf. the various iterations of Flagged Revisions. b) There would need to be an RfA equivalent process even for this. No need; unless a wiki chooses to make it so. c) the en.wiki community would be very unlikely to accept it beyond what we already have Let's find out. The current system is certainly needlessly frustrating and time-wasting for many authors. – SJ + 23:28, 16 March 2018 (UTC) reply

Completely untrue, and I'm honestly amazed a former WMF board member does not realize this. Requiring public vetting for hidden content (deletion) is how the Wikimedia Foundation remains exempt from the legal liability of admins having access to Special:Undelete, and it is the reason RfA or an equivalent process of public community vetting is required by the Wikimedia Foundation to access Special:Undelete and view deleted revisions. When the community last discussed this same topic (unbundling the view-deleted permission, which would be the exact same thing as you are proposing in practice) the WMF vetoed it because they feared action from the United States Congress.

Flagged revisions/pending changes are completely different because they are 100% viewable by the public by simply going to the revision history. They are not hidden. They are just not rendered on the front facing article. All anyone has to do is hit "view history", and the changes are all there to see. I just tested this on these revisions that are currently pending while logged out, and I saw them just fine (anyone can test this themselves by simply going to Special:PendingChanges, picking a diff, logging out, and viewing it). You are proposing the one line in the sand that the WMF has drawn re: unbundling permissions. There is no difference between "content not suitable for the public but only certain logged in users can see" and "deleted content". Pending changes is not near equivalent. TonyBallioni ( talk) 02:32, 17 March 2018 (UTC) reply

About a year ago, I took an extended look at how many times we AfC reviewers accepted a submission for publication, compared to how many times we declined to accept them. Over a two-month period, I tabulated about 1,500 such decisions, finding that about one in six of our decisions were "accept". Of course, the percentage of "accept" decisions is not quite the same as the percentage of submissions that were accepted, because many submissions receive multiple "declines". Then again, the interactive nature of the AfC process means that quite a few "accepts" had earlier been "declines". Also, my method of tabulating the data could not catch any submission that was speedily deleted (i.e., CSD'd) prior to a review decision. But still, the rate at which AfC submissions get accepted for publication is substantially higher than the small percentages that are being bandied about in this discussion.
If anyone cares to take a snapshot look at the current numbers, the list of submissions accepted for publication within the last 36 hours is here; the similar list for declines is here. NewYorkActuary ( talk) 21:14, 14 March 2018 (UTC) reply

@ NewYorkActuaryThanks for that. Note that if comparing numbers, you shouldn't necessarily count articles reviewed by Legacypac or myself, as we have been reviewing them after performing WP:BEFORE and largely accepting based on NPP guidelines for acceptability (sources exist) rather than AfC ones (sources are in the article). Looks like ~290 declined to ~30 accepted, not counting me and LP (~10% accepted). This seriously calls into question the WMF's claim of 1.2% promoted (essentially 2.4% given that only half of drafts ever see AfC). Considering that articles may receive multiple 'declines', but can only ever get one 'accept', these numbers are not even close to similar. @ Kaldari& Nettrom, We need more info on these draft promotion statistics to understand why they are so wildly different. — Insertcleverphrasehere ^{(
or here)} 21:37, 14 March 2018 (UTC) reply

1.2% survival on Draft is either dead wrong or wildly misleading. User:Legacypac/CSD_log shows I CSD thousands of Drafts. And I know we don't kill almost 99% of all Drafts because I personally promote/redirect/postpone more than 1% of the ones I look at up for G13. I also promote to mainspace (or others later promote) way way more than 1.2% of the AfC submissions I look at. I'm wondering if the G13 expansion cleanup a bit over 6 months ago is throwing off the numbers. We axed over 6,000 old non-AFC drafts in that effort over several weeks. The calculation is based on 529 accepted AfC drafts in 3.5 years, which is 1 every 3 days which can not be correct. Way too low. Legacypac ( talk) 21:56, 14 March 2018 (UTC) reply
- Sounds like this needs more research. Kaldari ( talk) 22:02, 14 March 2018 (UTC) reply

User:Fram could comment on the dump of SvG pages into draft from mainspace. Most of them were eventually mass deleted. Even there more than 1.2% were promoted back to mainspace. Legacypac ( talk) 23:24, 14 March 2018 (UTC) reply

I'll dig into this tomorrow and report back. Regards, Nettrom ( talk) 23:49, 14 March 2018 (UTC) reply

There are more than 80,000 accepted (and tagged) AfC submissions (see Category:Accepted AfC submissions), so having only 520 accepted in the 3+ years until 2017 seems wildly improbable. The dataset starts at 1 July 2014: for that date alone, we have Category:AfC submissions by date/01 July 2014 32 accepted submissions. However, many of those were not in the draftspace, but in Wikipedia:Articles for creation... Even so, we have for this single day 1 2 3 4 5 (since merged) 6 7 8 9 10 11 12 13 14 15 16 17 18 articles moved from Draft space to mainspace. Which means, assuming that this is an average day, that the number of 520 accepted articles would be reached after 1 month, not after 42 months. Even when looking at more recent months like Category:AfC submissions by date/September 2016, we get a few hundred accepted submissions from the draft space (AfC space wa no longer in use then). I think it would be best if this study was rechecked thoroughly, as the numbers seem way off. Fram ( talk) 08:04, 15 March 2018 (UTC) reply

I want to say now, having been doing NPP extensively since winter 2015, that ACTRIAL has been a major step forward–a huge improvement. NPP previously was an interminable drama sink: users with no understanding of notability and a belief that Wikipedia is a medium for untrammelled self-expression created endless articles on utterly non-notable topics and then got into fights and attacks on more experienced contributors as they got deleted. Creating a system where new editors have to understand that Wikipedia is collaborative and has minimum standards for articles has been a positive step in creating a better culture on Wikipedia for helping new editors socialise and develop their skills at tasks like adding citations, and also massively reduced the workload our limited number of experienced contributors have to do to screen out vandalism and hoax articles. ACTRIAL should definitely be made permanent. Blythwood ( talk) 20:28, 15 March 2018 (UTC) reply

Overall, I think it's clear that ACTRIAL was a success, though the continued backlog at AFC is problematic. I'd feel more comfortable encouraging new users to create pages through AFC if the backlog was around 1 week. I do suspect the percentage of accepted drafts is actually closer to 15% than 1.5%, and for non-promotional editors I assume it would be even higher. As far as Sj's comment regarding a dynamic spectrum of [highlighted] to [hidden] articles: I'd love to see (for example) the ability to add content about any book on its Special:BookSources page without the requirement that the book be "notable". But that's not really relevant here. Articles on non-notable private persons, articles promoting non-notable people, and articles promoting new non-notable corporations/organizations should still be removed from the project entirely, and that's the vast majority of declined drafts. power~enwiki ( π, ν) 20:34, 15 March 2018 (UTC) reply

It strikes me as somewhere between 'fine' and 'beneficial' in the long run to expressly make space for freely-licensed draft articles on non-notable topics of all kinds, as long as those a) do no harm, b) don't amplify the google rank of those topics, c) aren't visible to the average reader. Any topic can one day be notable; and the fact that someone wants to write a sourced, freely-licensed article about that topic is worth preserving against that eventuality. [that's what makes it "fine"].

Being able to draft a first article (without a fight and aggressive complaints and criticism) is also a great way to become a Wikipedian; even if the first article doesn't fit all current policies, or is't made globally visible. The ability to say "that's interesting, but not notable; come back once this topic crosses one of the following bars" is 100x better, more accurate, and more welcoming than having to say "demonstrate how you meet one of these criteria, or you're abusing our site and we will delete all trace of your work." Allowing people to gradually work on topics over time without having to immediately win a deletion debate may measurably improve retention, public image, and the % of developers who think it worthwhile to enhance MediaWiki rather than building their own knowledge platform elsewhere. [so, "beneficial"] – SJ + 23:28, 16 March 2018 (UTC) reply

It will be interesting to see what Nettrom and Kaldari find on actual AfC acceptance rates; as others have pointed out 1.2% seems unlikely. It looks like number of AfC submissions per day is a low 3-digit number and number of acceptances in a low 2-digit number. That would put the acceptance rate somewhere roughly around 10%. But, many of our AfC submissions are actually resubmissions so that's going to make the acceptance rate look lower than it actually is.

There is a problem in AfC with reviewers reluctant to approve underdeveloped submissions on potentially notable topics. Fundamental acceptance criteria is that accepted drafts should not be WP:LIKELY to be deleted. Since the WP:LIKELY threshold is just 50%, if reviewers were to take this to heart, we'd see a healthy number of deletions of accepted drafts. This is another good topic for additional research, but I beleive few accepted AfC submissions get deleted. A low deletion rate is to be both celebrated and cursed, in my opinion. In any case, if we were able to get AfC reviewers to let more marginal drafts through, based on my reviewing experience, I think we would less than double AfC acceptance rate so no huge numbers breakthrough there. ~ Kvng ( talk) 14:46, 16 March 2018 (UTC) reply

Legacypac and I (mostly Legacypac) have been collecting some data that supports what you are talking about. See User:Insertcleverphrasehere/AFC_stats. What we are seeing is about half of AfC drafts being notable topics, but many of these will be rejected as not being developed under the current AfC reviewer guide. I will be proposing some reform to the way that AfC deals with notable topics soon. My view is that if the decline is for notability concerns, AfC reviewers should do a search and if it is notable you should publish, unless it is CSD-able, regardless of the references included in the article (and slap a {{sources exist}} tag on it to prevent someone taking it to AfD). — Insertcleverphrasehere ^{(
or here)} 19:03, 16 March 2018 (UTC) reply

I wouldn't have guessed that 50% of submissions have such WP:N potential. I notice that a lot of entries in your AFC stats page are marked "borderline" so maybe you guys are even more generous than I on these assessments. Most borderline stuff about people and companies does not survive AfD these days. My guess, stated above, is that by enforcing AfC acceptance of notable topics, we could get twice as many AfC submissions to mainspace. So if we're currently accepting 1.2% we could get to 2.4%. If we're currently accepting 25% we could get to your 50% claim. It will be interesting to see what the actual current AfC acceptance number is.

I have tried to encourage more liberal acceptance at AfC but have hit limits as to what the community is willing to accept in terms of WP:NPOV, WP:CV and WP:PAID. This puts conscientious reviewers in the position of needing to do significant work on drafts before accepting them which is not what you want when there's limited manpower working on a backlog. ~ Kvng ( talk) 20:28, 16 March 2018 (UTC) reply

1.2% of what, exactly? And a bit of context

Thank you for all the comments on this thread! I'd like to try to explain what we calculated 1.2% of exactly, what we didn't measure, and try to put that number in context. I hope it's okay that I add a subsection for this and respond to some specific concerns inline below.

The report tries to make it clear how I calculated that 1.2%. Here's the quote: "the publication rate of pages created in the Draft namespace is incredibly low (about 1.2%)" The denominator is "pages created in the Draft namespace" (to the best of our ability, recreation of Wikipedia page histories is difficult and we use the Data Lake for data up until July 21, 2017; see also wikitech:Analytics/Data Lake/Edits#Limitations of the historical datasets). I gathered a dataset of 126,957 pages created in Draft between July 1, 2014 and December 1, 2017. I then had a script go through the edit history of all those pages to identify a move from Draft to Main, and found 1,550 pages had done so (and that's 1.2% of the total).

When I first encountered this, I thought there were errors in my dataset. I therefore did a sanity check by identifying how many pages were live in the Main namespace on enwiki as of February 1. The result was about 33% lower. Given that pages get deleted and/or moved back to Draft, getting a lower (but roughly in the same ballpark) number would make sense. I also interpreted that finding to mean that my result seemed reasonable.

We did not measure the AfC acceptance rate. As many have brought up, the number of accepted submissions at AfC is much higher. There are several things to note about the difference between these calculations. First, I used pages created between a defined timespan in our calculation, partly because I needed reliable deletion data, and partly because we used a consistent end date in our analyses. I do not know what timespan "accepted AfC submissions" covers. Second, there's a large number of accepted redirects (from Wikipedia:Articles for creation/Redirects, I expect). I made a SQL query that I ran a couple of days ago, finding that the number of non-redirecting accepted submissions was just shy of 50,000. Third, AfC submissions can originate from several places (e.g. Legacypac mentioned Most userspace submissions that are not test edits get moved to Draft.) We only studied pages created in the Draft namespace, partly because that allows us to mine edit histories of both live and deleted pages, and partly because we expected ACTRIAL to result in an increase in AfC submissions of that type of pages. It would've been great to also have looked at AfC submissions from the User namespace, as well as pages moved into Draft from the Main namespace.

This discussion has been a great learning experience for me, making it clear that I should be careful about statistics added to reports. They'll be scrutinized (which is good, btw), so I should make sure that I've done my due diligence. Insertcleverphrasehere asked us to clarify our numbers early in the thread, at which point I supplied some slightly different numbers, without explaining them very well. That did not make this discussion any clearer, and I'm sorry about that. They also asked Is it possible to crunch the numbers for the past 6 months and see what kind of percentage we are talking about? Using the same approach and looking at the first five months of ACTRIAL, the rate is 11.1% (ref this gist for those who want some details). I chose to not use pages created after February 15 to let there be some time for reviews and moves.

I also wanted to add a bit of context to the usage of the 1.2% statistic in the report. It was not part of our main study, so it is not part of the main results either; it's used in the part of report where we make some recommendations to the WMF. The main argument we're making is that creating an article is a difficult task, and that the WMF should look at design improvements of that process. In order to back that argument up a bit, or to exemplify that point, I decided to use two tangential statistics from our work. Due to where in the report this was, I did not expect the scrutiny this got, as I instead thought readers would focus on the main argument. In hindsight, I should've been more careful about throwing numbers around, and/or dug up some research papers to cite instead. Apologies for the wall of text response, but I hope it's been helpful in answering some of the questions about this number and how it came to be. I appreciate the feedback and scrutiny this has gotten, so please do ask follow-up questions as well. Cheers, Nettrom ( talk) 23:01, 16 March 2018 (UTC) reply

First off, we need to pull the 1.2% number from the report until we can get to the bottom of this. People are getting excited about it and are building prejudices about AfC based on it. As you clarify here, 1.2% is not a direct measure of AfC, it is about draft space. AfC is not the only use of draft and draft is not always used for AfC. Several of us have shown that 1.2% does not pass some basic sanity checks. We need to decide whether what you've measured here is what we want to be measuring. If not, we need to decide what we want to be measuring. If so, we need to take a close look at the methods here to resolve some of the failed sanity checks - your 1,550 draft to mainspace moves over 1,249 days seems particularly suspicious ( [1] would lead us to expect a number at least 10 times higher). ~ Kvng ( talk) 16:22, 17 March 2018 (UTC) reply

It seems that there are two primary conclusions based on Nettrom's re-analysis: 1) that most AfC drafts undoubtedly do not start in the draft space (probably mostly starting as sandbox drafts, but some in mainspace too). and 2) that for those drafts that start in the draft space, the AfC acceptance rate seems considerably higher during ACTRIAL; there seems to have been a lot more well put together drafts being created in the draft space over the last six months, undoubtedly the result of new users unable to publish directly that are heading through the redesigned article wizard which makes draft creation super streamlined. This is very good news, and indicates that we are perhaps saving many of the 20% or so of acceptable articles that used to be created by new editors.

The next thing that we need is to determine the actual AfC acceptance rate, based on all drafts submitted to it, regardless of where they originated from. If we can get this data from the 6 months before ACTRIAL, and during the 6 months of ACTRIAL, this would give us a very good idea of how ACTRIAL has impacted AfC. I feel that this research should be given a very high priority, as it may have a direct impact on the decision to permanently implement autoconfirmed article creation. — Insertcleverphrasehere ^{(
or here)} 20:10, 17 March 2018 (UTC) reply

It sounds like you propose that AfC acceptance rate should be measured over some period as Number of articles moved from AfC submission queue to mainspace⁄Number of unique articles submitted to AfC queue. This works if the period is long compared to the backlog time and I think your proposed 6-month period satisfies that. If we want to do finer time-based measurements, we need to more closely examine the fate of each AfC submission. There is also the question of whether we want to count incomplete submissions in the denominator. We do have authors start and save a draft but never submit it for review. I suggest we don't count these because I think we want to be able to do an apple-to-apples comparison and I'm sure over in the NPP side we have authors who click a redlink to start a new article but never hit the save button. ~ Kvng ( talk) 20:33, 17 March 2018 (UTC) reply

Term examined

I am concerned that only the first two months of data were examined. There are six months of data and no analysis has been done to validate that the two months examined are representative of the entire trial period. Jbh ^Talk 14:12, 14 March 2018 (UTC) reply

I'll let Nettrom comment on the implications of only looking at the first two months. There are several reasons why we chose to do that though:

We wanted to make sure there were no serious effects on new editor retention without waiting 6 months to discover that (in case there was a good reason to end ACTRIAL early).
Several metrics (mostly related to editor and article retention) can't be effectively analyzed until at least a month after the time period being looked at (and we suspected the community would not be keen on waiting a month after the end of ACTRIAL to see an analysis).
Nettrom's opinion was that we should be able to see any important changes within the first two months data.

That said, all the data and even the scripts for analyzing the data are publicly available, so if anyone feels like updating the analysis with more info, they are welcome to do so. I don't expect they will find any major differences though. Kaldari ( talk) 17:39, 14 March 2018 (UTC) reply

Stopping by here to second what Kaldari mentions. Prior to the start of the trial, we discussed its length with the community (that discussion can now be found in the ACTRIAL talk page archives). I argued that we didn't need six months, because given how large the English Wikipedia is the effects will be quickly apparent. That being said, during the analysis I have seen no indication that the first two months of ACTRIAL are atypical of Wikipedia. Do you have a reason to believe the last four months would be different? Regards, Nettrom ( talk) 20:53, 27 March 2018 (UTC) reply

@ Nettrom: Thank you for your response. I do not see how you would see whether the first two months were representative or not since the other four months were not examined -- the only way to tell that is to examine the data. So, yes, it is likely that examining the whole trial period will either reinforce clarify trends from the first months or indicate aberrations. Wikipedia is a social institution and people adjust to new things over time, not as if a switch is flipped. I suspect it would take several months for a new editing equilibrium to become apparent. The first two months will barely capture the perturbation of the system.

In the report it is mentioned that some of the results may have been related to ACTRIAL or possibly an influx of students. In several areas it was mentioned that the effects were minor ie within the noise. A longer sample period allows those trends to be better examined. In the case of article creations/deletions only now are the first draft articles coming up for examination for G13 so the deletion numbers will change — to get apples-to-apples numbers I would suggest pulling a sample of draft articles and apply the same criteria which would have been applied had they been created in article space.

Beyond that the full six months will give some indication of how people adjusted over time. Do more or less newly created accounts create articles not that they are drafts? Was bad article creation merely shifted to draft or did it decrease? This would give some indication of the long term increased loading of AfC; Did people stop creating crap once they could not publish it? If so, over the term of the trial did bad article creation remain the same or did the population learn that it was pointless to even write the bad articles?

I simply can not imagine how any question to which ACTRIAL is relevant would not benefit from the examination of a longer time series. Jbh ^Talk 21:36, 27 March 2018 (UTC) reply

Can someone please extend this graph to show the full 6 months of data. Two months does not appear to be enough here to accurately understand the implications for AfC. ~ Kvng ( talk) 14:26, 17 March 2018 (UTC) reply

Maybe Nettrom could help. Kaldari ( talk) 02:55, 24 March 2018 (UTC) reply

@ Kaldari and Nettrom: I've downloaded the code and had a look at running it, but it appears to depend on database access. Does it have to be run on the WMF labs, or is it possible to run it remotely? Thanks. Mike Peel ( talk) 17:58, 24 March 2018 (UTC) reply

@ Mike Peel and Kvng: If I can get a bit of time, I'll make a small improvement to the script so it identifies withdrawn submissions correctly, then run it to get data up until the end of the trial. Might be able to do it this week, but expect it by the end of next week at the latest. I'm fairly sure the code can be rewritten to not require database access, but some of the things it does is faster with it. It also requires an account with access to deleted revisions from the API. Cheers, Nettrom ( talk) 20:53, 27 March 2018 (UTC) reply

@ Nettrom: Any news? It'll be too late to affect the RfC, since that's just been closed, but it would still be good to see the results. Thanks. Mike Peel ( talk) 14:47, 18 April 2018 (UTC) reply

@ Mike Peel: Thanks for keeping an eye out and pinging me about this! Sorry that it's delayed, I've had to prioritize other tasks but am picking this up again now. I created T192574 to track the progress on this and make it clearer what the process is. Let me know if there are any questions or concerns. Hopefully this should be done in a couple of days. Cheers, Nettrom ( talk) 17:15, 19 April 2018 (UTC) reply

Filtering out Wiki Education student editors

"One thing to keep in mind is that during this period, September to November, we typically see an increase in retention, likely due to the school cycle." If anyone goes forward with an analysis of the entire trial period, it would probably be useful to filter out Wiki Education student editors from the data. (I can provide a list of usernames.) We had about 6000 new users who were students doing Wiki Education-supported assignments during that period (and will have about that many again for early 2018), about 15-30% of whom I estimate would have met the 'survival' metric. Very few of them create drafts, as we guide them them to start in sandboxes and the ones who create new articles end up autoconfirmed by the time they want to move to mainspace; ACTRIAL ended up having very little impact on our program, but removing the student editors from the data might help isolate the effects the trial looked for.-- Sage (Wiki Ed) ( talk) 20:22, 14 March 2018 (UTC) reply

Challenging an assumption

Here Wikipedia:Autoconfirmed_article_creation_trial/Post-trial_Research_Report#Less_low-quality_content_in_article_space WMF staff talks about increased deletions and says "This increase in deletions is not commensurate with the increase in draft creations, meaning that we see a lot of created drafts that appear to not warrant deletion."

This is not correct - it ignores timing. In Mainspace a delete worthy page will most likely get caught in NPP amd deleted within 90 days (even with the backlogs). In Draft page will not typically face deletion until at least 180 days as a G13 or much longer if it goes into the AfC review, waits for reviews 3 or more times, gets some bot or AWB or disambig edits etc. You can somewhat see this by looking at dates created on this list User:MusikBot/StaleDrafts/Report. In other words, the increased unencyclopedic content created in Draft instead of mainspace during ACTRIAL is mostly still sitting with no move yet to delete it.

It's not that the Drafts do "not warrent deletion" it is that they are not old enough to delete G13, no one is even looking at the non-AfC submitted Drafts yet, and even if we wanted to review those pages Gx CSD criteria is way narrower than Ax CSD criteria is. There is no PROD for Drafts. Further some editors insist that we can't look at Notability in MfD (see WP:NMFD) so sending lots of garbage there will not be accepted. Even editors who are happy to debate notability at MfD prefer to let G13 catch the junk later because there is so much junk, trying to debate it all is hopeless.

I 100% agree that a shift of new creations to Draft space has happened, but that is a wonderful thing. Draft space is no index and it has a clock where an Admin human will look at every Draft page in 6 months or so unless someone first deletes or promotes it. Legacypac ( talk) 22:46, 14 March 2018 (UTC) reply

The drafts "do not warrant deletion" because they're drafts where the criteria for deletion are more lax. If some of that crap were in mainspace, it would be subject to speedy deletion (A7, some even delete on sight worthy) or prodded. This is a deficiency in policy; irredeemable garbage intended for mainspace should be subject to speedy deletion wherever it resides. MER-C 22:14, 15 March 2018 (UTC) reply

@ Legacypac and MER-C:, absolutely, and the place to give more impact to your comments is at Wikipedia_talk:The_future_of_NPP_and_AfC. Kudpung กุดผึ้ง ( talk) 01:16, 17 March 2018 (UTC) reply

Isn't this line of thinking -- that drafts should be deleted if not worthy of mainspace -- generally ignoring the point of drafts as unfinished works? I've had drafts on perfectly reasonable, notable topics where I simply didn't finish the article and didn't come back to it deleted (and I've been an editor for long enough that I often come back to things after years), which is quite different from "irredeemable garbage." I don't have a sense of how many drafts fall into which category -- spam versus non-encyclopedic topics versus unfinished articles -- but it doesn't offend me nor can I see how it harms the encyclopedia to simply let userspace/Draft space drafts [not drafts actively submitted to AfC, but pages in general draft space] alone, assuming they are not actively harmful to anyone (per @ Sj:'s comments above). Noindex solves pretty much the only real problem I can imagine there being to not deleting drafts. -- phoebe / ( talk to me) 03:49, 17 March 2018 (UTC) reply

@ user:phoebe you are talking about WP:G13 expansion which was widely advertised and overwelmingly approved. Drafts are a temporary workspace not a holding pen for musings, spam, non-notable topics and unattributed alternative copies of mainspace. If you want a page back see WP:REFUND. Legacypac ( talk) 20:47, 17 March 2018 (UTC) reply

Thanks

I would like to express my personal thanks to Danny, Kaldari, and their team for the enormous effort on this research which I consider to be one of the most objective and useful in recent times. ACTRIAL has been an uphill challenge to me all these years to get it it carried out and I'm enthralled not only by the data it has produced, but also by the resounding success it has been. This kind of work helps close the gap between the Community and the Foundation. I hope that the recommendations will be carried through and I look forward to collaborating closely with the WMF on future developments. Again, my heartfelt thanks. Kudpung กุดผึ้ง ( talk) 19:37, 15 March 2018 (UTC) reply

WP:ACREQ started

I've started a page at WP:ACREQ as a centralized place to assemble the case for making Auto-Confirmed REQuired for new mainspace page creations. This should save us typing the same case over and over and be useful when we run the RfC. We can also collect endorsements from interested editors. Legacypac ( talk) 02:24, 17 March 2018 (UTC) reply

ACTRIAL and editathons

A note that though it may not be the biggest use case, I and others who run editathons and other events designed to get new editors started have struggled with how to make ACTRIAL work in a setting where new editors want to create articles and are being guided and trained to do so. (For one thing, the fact that new users could no longer create articles was a surprise to many of us, leading to lots of confusion midstream in major events.) Given editors at editathons are already generally having their work checked by others, I suppose the most efficient workflow if ACTRIAL stays in place would be to have them write in draft space and have a trainer or experienced editor move the articles to mainspace there and then; but it would be helpful to have that documented. That workflow also makes a fair number of assumptions about the capacity of the trainer in any given event. -- phoebe / ( talk to me) 04:01, 17 March 2018 (UTC) reply

phoebe, one of the problems is that some of the main edithathon facilitators are not admins. From what I know about adminship, they would easily pass RfA if they wanted to. That said, if up coming editathons were to be more widely published where it matters, plenty of admins would be available on line during the event. This is how it works for the WP:Women in Red editathons, and it would avoid the recent problems created by the recent editathon in in South Africa. No editathon participants want to see their efforts tagged for deletion as being inappropriate for an encyclopedia, but that's what happened. Our Wikimanias which take place on a different continent every year are a golden opportunity to present talks about recruiting for genuine new editors, but I don't see any major presentations that focus on editathons. I co-facilitated an editathon at Wikimania in London and as the steering Wikipedia for ACTRIAL for years, it crossed my mind even then, but I could not see how ACTRIAL would be a serious restriction. And it hasn't been. Kudpung กุดผึ้ง ( talk) 06:52, 17 March 2018 (UTC) reply

@ Kudpung: Indeed -- I am a former WMF board member, and I've facilitated dozens of events and trained hundreds of new editors, but I am not an en.wp admin (nor have I ever been). That's simply not something we can or should expect of editathon trainers. As for presentations, there have been several dozen talks and focused tracks about editathons at various Wikimanias, including the latest one (which I ran the program for, incidentally, so I do know); there are many, perhaps most, events that are not run by the audience of 1,000 people or so that go to Wikimania, though. -- phoebe / ( talk to me) 04:01, 18 March 2018 (UTC) reply

Moving pages at events

This was discussed before ACTRIAL in detail. Since new editors are working in Draft anyway the only actual restriction is that non-auto confirmed users can't use their own account to move their creations to mainspace. Consensus at an RFC was against creating a userright for trainers that would allow then to AC new users directly. Some users felt strongly that editthon users should not get special treatment at all.

There are a number of work arounds including:

Request users create an account 5 days before the event. One event leader reported 100% success with this request.
Have an experienced editor at the event move the pages after checking them. This can done on any device logged in as a user with appropriate rights and the new users can follow along and learn from it
Have an experienced editor anywhere in the world monitor the drafts off an event page and move them as appropriate.
Use WP:AFC
Have new users make a request at WP:PERM for confirmed status, specifying they are at an event and naming the editor running the event. Singlce hopefully training includes the importance of both maintenance and content creation, new users should learn about PERM anyway so later they can request NPP or other user rights.
Have a physically or virtually present Admin give out Confirmed status for event participants.
Event coordinators are encouraged to seek Adminship. There is a shortage of active admins and users willing to apply, so why not give tools to our dedicated trainers.

Remember new users don't know that before ACTRIAL they could have made the move themselves. Don't make it a negitive thing, present the restriction as a positive small control Wikipedia has to discourage vandals and spammers. Everyone will appreciate the need to discourage bad actors. Legacypac ( talk) 05:16, 17 March 2018 (UTC) reply

I fail to see how ACTRIAL trial could have come as a surprise to anyone at all. It's a major policy change and it's never ceased to be widely discussed since it was originally suggested in 2011 and I made enough noise about it at every Wikimania I have attended. All the CEs of the WMF and C-Level I have spoken to have been fully aware of it including the Founder and most Board members . Kudpung กุดผึ้ง ( talk) 06:37, 17 March 2018 (UTC) reply

Remember, that for policy discussions these days, even the major ones, "widely discussed" means among a few hundred editors active in a particular kind of space at best. Many many hundreds more editors who don't participate in these spaces run and participate in events and teach Wikipedia. -- phoebe / ( talk to me) 04:04, 18 March 2018 (UTC) reply

phoebe, then WADR, they shouldn't be running them if they can't be bothered to stay abreast of major developments on the encyclopedia they represent, all they need to do is log in from time to time. I'm one of the most policy-active users on en.Wiki and I created and/or co-created several of them, including the draft namespace. I also shepherded the ACTRIAL for years to its implementation by the WMF while also taking into account any possible concerns of the WiR and event organisers. The RfCs were very widely notified and seen by 1,000s of users. Whether they vote on them is their preogative, but any consensus (for or against) will be carried. A lot of effort, time, and money has been spent on this WMF supported one which is, or will be, the most major change in policy for years. Please see the WMF report and do consider joining the pre-RfC discussion at Wikipedia talk:Autoconfirmed article creation trial while there is still time. Kudpung กุดผึ้ง ( talk) 08:13, 18 March 2018 (UTC

AfC backlog "struggle"

Concern about an increase in AfC backlog and associated "struggle" is mentioned in the lead, introduction and Shift in content creation and review and Suggestions to the Wikimedia Foundation sections. The WMF team has indicated they have limited understanding of AfC and are now working to gain better understanding. AfC has a long history of backlogs. Past history needs to be examined and we probably need to look at the full 6 months of ACTRIAL data to understand the actual impacts here. I beleive the hand wringing about AfC backlog is unsubstantiated and possibly even non-NPOV. I propose to remove "struggle" discussion from the lead, introduction and and Suggestions to the Wikimedia Foundation section and flag any unsubstantiated statements in Shift in content creation and review. ~ Kvng ( talk) 15:22, 17 March 2018 (UTC) reply

+1 to that. ACTRIAL is not the main reason for increased AfC backlog. Other reasons include:

During almost the entire trial 18 June 2017 - March 4, 2018 I was restricted to submitting good drafts to AfC to get them from stale userspace to mainspace. I also reviewed many good drafts but could not approve them. I'm working through the list now and approving pages I already tagged as good as I find them again
We lost two very prolific reviewers to socking blocks
- SwisterTwister on 28 December 2017
- DrStrauss on October 30, 2017

With such a limited number of really active reviewers, the loss of three of us hurt the backlog. However, the true backlog is not two months. Most pages get processed quite quickly as covered here WP:ACREQ. It's the borderline and complex topics that pile up. We are working on a plan to accept pages that are notable but not perfect. Legacypac ( talk) 15:35, 17 March 2018 (UTC) reply

Also the entire AfC backlog is only 2400-2500 pages, much less than the much reduced NPR backlog. Legacypac ( talk) 15:41, 17 March 2018 (UTC) reply

@ Legacypac: Can you edit your post here and and attach dates to these AfC issues? This will help us correlate with features we may find in the AfC data. ~ Kvng ( talk) 16:00, 17 March 2018 (UTC) reply

I give anyone permission to edit in this info. I'll try though. Legacypac ( talk) 16:03, 17 March 2018 (UTC) reply

Backlog is dropping below 2200 now. Legacypac ( talk) 14:57, 19 March 2018 (UTC) reply

Please tell the WMF that the best way to understand the AFC struggle is to write a new article, submit it, shepherd it through the process, and experience the whole thing for ones' self. There is really no substitute for that. 173.228.123.121 ( talk) 06:26, 24 March 2018 (UTC) reply

Well, that's a different struggle and it is discussed Above. ~ Kvng ( talk) 14:56, 24 March 2018 (UTC) reply

Article growth

One of the hypotheses was that The rate of article growth will be reduced. This was found to be unsupported. Why is this not mentioned as a finding or key finding in the lead or introduction of the report?

I attempted to add this myself but my change was reverted by Nettrom because, "content changes can be discussed on the talk page, the overview corresponds to the three main areas of the study." Well, the report says, "Our study is organized into three themes corresponding to the main findings above." Perhaps I misunderstand the comment or context but this sounds circular. ~ Kvng ( talk) 20:11, 19 March 2018 (UTC) reply

I think I understand where Nettrom is coming from; this data is not expounded upon in the body of the report, possibly due to some uncertainty in the validity of the results. 20% (the percentage of new editor creations that used to survive) of the 300 or so articles created by new editors is only 60 or so articles per day that we should expect to survive long term. 60 articles seems to be well within the level of noise and uncertainty, and difficult to separate from other trends for raising and falling article creation. As a result, based on the data gathered, the WMF may be able to say that their hypothesis that "the rate of article growth will be reduced" was not supported, but this does not mean that the reverse hypothesis is true ("the rate of article growth did not reduce"). Realistically, the number of articles by new users that used to survive deletion is so low and the noise in the data so high that we will probably never know if it was slightly reduced from what it would have been without the trial. It is even possible that article creation was higher during the trial than it would have been without the trial (due to more available time for reviewers and admins to work on projects of their own), but again, we can't know because there is too much uncertainty in the data. — Insertcleverphrasehere ^{(
or here)} 20:32, 19 March 2018 (UTC) reply

I don't know if that is just your own speculation. The write up in the body doesn't hint at any of this. It speaks confidently that article growth didn't change because stuff that would have been deleted was never created. I'm pressing this issue because not creating stuff that later needs to be deleted is a big win for everyone especially when this is done without impeding creation of stuff we want. ~ Kvng ( talk) 20:43, 19 March 2018 (UTC) reply

A reduction in growth rate was never going to be statistically significant, and the WMF report sort of says this when it points out that most of the articles created by new editors were deleted anyway. Given this, I understand why Nettrom says that no change in article growth wasn't a major finding. This does not mean that it isn't a good idea to point out that we haven't had any huge drop in the growth rate, as some people seem to think based on comments elsewhere. I don't think we need to worry overly, as it seems that this is not a big sticking point in the RfC anyway. — Insertcleverphrasehere ^{(
or here)} 21:34, 19 March 2018 (UTC) reply

Yes, I think people are looking directly at the data or going on their direct experience during the trial in deciding whether to support permanent ACTRIAL. All valid stuff. This report is not a big part of the discussion right now. ~ Kvng ( talk) 21:45, 19 March 2018 (UTC) reply

@ Kvng: The main reason why the report does not list the result for Hypothesis 15 in the overview and introduction is that the report follows the structure of our research questions. There are three of those, so we have three key points in the introduction and overview. In both cases, we try to summarize the findings on a higher level rather than point out the results of a specific hypothesis. At the end of both sections, we refer to the research page on meta where interested readers can get more details. Cheers, Nettrom ( talk) 20:40, 20 March 2018 (UTC) reply

When will the 6-month report be ready?

Given the potential impact of this, a preliminary 2-month report is not enough. When will the 6-month report be available? Thanks. Mike Peel ( talk) 01:26, 24 March 2018 (UTC) reply

as discussed elsewhere, there will not be any more report. Legacypac ( talk) 18:24, 24 March 2018 (UTC) reply

Just adding a reference to the discussion thread on the RfC talk page where this is also discussed, in case anyone wonders where "elsewhere" is. Cheers, Nettrom ( talk) 20:10, 27 March 2018 (UTC) reply

Our suggestions

It's great to see this report. But the headline number seems to be: 1.2%

1.2% of what, exactly? And a bit of context

Term examined

Filtering out Wiki Education student editors

Challenging an assumption

Thanks

WP:ACREQ started

ACTRIAL and editathons

Moving pages at events

AfC backlog "struggle"

Article growth

When will the 6-month report be ready?

Our suggestions

It's great to see this report. But the headline number seems to be: 1.2%

1.2% of what, exactly? And a bit of context

Term examined

Filtering out Wiki Education student editors

Challenging an assumption

Thanks

WP:ACREQ started

ACTRIAL and editathons

Moving pages at events

AfC backlog "struggle"

Article growth

When will the 6-month report be ready?

Videos

Websites

Encyclopedia

Facebook