This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Archive 1 | Archive 2 | Archive 3 | Archive 4 | Archive 5 | Archive 6 | Archive 7
I want your thoughts on the practice of creating a draft article with AI as a starting point before verifying all points in it. I see this as a potentially useful strategy for making a well flowing starting point to edit. Immanuelle ❤️💚💙 (talk to the cutest Wikipedian) 18:13, 3 April 2023 (UTC)
AI generated articles show up a lot on ANI. I think it might be helpful to add a dedicated noticeboard for this stuff. IHaveAVest talk 02:01, 4 April 2023 (UTC)
The Terms of use for ChatGPT [1] say: "As between the parties and to the extent permitted by applicable law, you own all Input. Subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output. This means you can use Content for any purpose, including commercial purposes such as sale or publication, if you comply with these Terms."
In threads above there is a link to a Sharing & publication policy [2] with attribution requirements. It's not clear to me whether this is generally in force. I think it may be meant for invited research collaboration on products that aren't yet publicly available. Sennalen ( talk) 16:47, 7 April 2023 (UTC)
This doesn't go far enough: "LLM-generated content can be biased, non-verifiable, may constitute original research, may libel living people, and may violate copyrights." LLMs also blatantly falsify both citations (creating plausible-looking cites to non-existent sources) and quotations (making up fake quotes from real sources). — SMcCandlish ☏ ¢ 😼 09:28, 13 April 2023 (UTC)
This section might be the part I have the most issues with. It uses "in-text attribution" incorrectly, and it requires the use of {{OpenAI}} for OpenAI LLMs, which blurs the line between OpenAI ToS and Wikipedia policy (and doesn't comply with OpenAI's ToS anyway). I also don't think we've reached a consensus on how to attribute, or resolved the outstanding issues involving inline attribution, whatever we go with. DFlhb ( talk) 03:31, 1 April 2023 (UTC)
If we are going to attribute content added with the assistance of LLM, there are two things to keep in mind:
As there are legal implications to both of these, regardless of our discussions here or anyone's opinion or preference, the ultimate output of any policy designed here must be based on those two pillars and include them. Put another way: no amount of agreement, or thousand-to-one consensus here or in any forum on Wikipedia, can exclude or override any part of the Terms of Use of either party, period. We may make attribution requirements stricter, but not more lax than the ToU lays out. (In particular, neither WP:IAR nor WP:CONS can override the ToU, which has legal implications.)
I'm more familiar with Wikimedia's ToU than ChatGPT's (which is nevertheless quite easy to understand). The page WP:CWW interprets the ToU for English Wikipedia users; it is based on Wikimedia's wmf:Terms of use, section 7. Licensing of Content, sub-sections b) Attribution, and c) Importing text. There's some legalese, but it's not that hard to understand, and amounts to this: the attribution must state the source of the content, and must 1) link to it, and 2) be present in the edit summary. The WP:CWW page interpretation offers some suggested boilerplate attribution (e.g., "Content in this edit was copied from [[FOO]]; see that article's history for attribution.") for sister projects, and for outside content with compatible licenses. (One upshot of this is that *if* LLM attribution becomes necessary, suggestions such as one I've seen on the project page to use an article-bottom template will not fly.)
Absent any update to the WMF ToU regarding LLM content, we are restricted only by the LLM ToU, at the moment. The flip side of this, is that one has to suspect or assume that WMF is currently considering LLM usage and attribution, and if and when they update the ToU, the section in any proposed new LLM policy may have to be rewritten. The best approach for an attribution section now in my opinion, is to keep it as short as possible, so it may be amended easily if and when WMF updates its ToU for LLMs. In my view, the attribution section of our proposed policy should be short and inclusive, without adding other frills for now, something like this:
Once WMF addresses LLMs, we could modify this to be more specific. (I'll go ask them and find out, and link back here.)
We may also need to expand and modify it, for each flavor of LLM. Chat GPT's sharing/publication policy is quite easy to read and understand. There are four bullets, and some suggested, "stock language". I'd like to address this later, after having a chat with WMF.
Note that it's perfectly possible that WMF may decide that attribution to non-human agents is not needed, in which case we will be bound only by the LLM's ToU; but in that case, I'd advocate for stricter standards on our side; however, it's hard to discuss that productively until we know what WMF's intentions are. (If I had to guess, I would bet that there are discussions or debates going on right now at WMF legal about the meaning of "creative content", which is a key concept underlying the current ToU, and if they decide to punt on any new ToU, they will just be pushing the decision about what constitutes "creative content" downstream onto the 'Pedias, which would be disastrous, imho; but I'm predicting they won't do that.) I'll report back if I find anything out. Mathglot ( talk) 03:51, 10 April 2023 (UTC)
"...you warrant that the text is available under terms that are compatible with the CC BY-SA 3.0 license (or, as explained above, another license when exceptionally required by the Project edition or feature) ("CC BY-SA"). ... You agree that, if you import text under a CC BY-SA license that requires attribution, you must credit the author(s) in a reasonable fashion."
It gives attribution in the edit summary as an example for copying within Wikimedia projects, but doesn't prescribe this as the only reasonable fashion. Specifically regarding OpenAI, though, based on its terms of use, it assigns all rights to the user. So even if the U.S. courts one day ruled that a program could hold authorship rights, attribution from a copyright perspective is not required. OpenAI's sharing and publication policy, though, requires that "The role of AI in formulating the content is clearly disclosed in a way that no reader could possibly miss, and that a typical reader would find sufficiently easy to understand."
"The attribution requirements are sometimes too intrusive for particular circumstances (regardless of the license), and there may be instances where the Wikimedia community decides that imported text cannot be used for that reason."
In a similar manner, it may be the case that the community decides that enabling editors to satisfy the disclosure requirement of the OpenAI sharing and publication policy is too intrusive. isaacl ( talk) 04:52, 10 April 2023 (UTC)
DFlhb had removed as a part of their reverted trim, and now I've removed it again. This topic is covered in wmf:Terms of Use/en. No useful specific guidance was provided here. There's no agreement that a policy needs to require use of Template:OpenAI as it is not obviously compatible with OpenAI ToS requirements. Editors advocating to include specific guidance about requiring attribution on this page should get consensus for the concrete version of text about this that they are committed to and want to see it becoming Wikipedia policy. — Alalch E. 11:41, 14 April 2023 (UTC)
It has been proposed to me on Wikisource that LLMs would be useful for predicting and proposing fixes to transcription errors. Is there a place to discuss how such a thing might technically be implemented? BD2412 T 23:00, 15 April 2023 (UTC)
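One rough sketch of how such a transcription-fix pass might look, purely as an assumption (the function names below are hypothetical, not from any existing tool): have an LLM read a transcribed page and emit "original -> correction" pairs that a human proofreader then accepts or rejects.

```python
# Hypothetical sketch of an LLM-assisted transcription-error pass: build a
# prompt asking for "original -> correction" lines, then parse the reply so a
# human reviewer can confirm each proposed fix against the scan.
def fixprompt(pagetext):
    # pure prompt construction; the wording here is illustrative only
    return ('The following is a transcription of a scanned page. List each'
            ' probable transcription (OCR) error on its own line in the form'
            ' "original -> correction", and output nothing else:\n\n' + pagetext)

def parsefixes(reply):
    # one (bad, good) pair per "original -> correction" line in the reply
    pairs = []
    for line in reply.splitlines():
        if '->' in line:
            bad, good = line.split('->', 1)
            pairs.append((bad.strip(), good.strip()))
    return pairs
```

Feeding fixprompt() output to a completion API (e.g. a helper like the claude() function in the scripts posted on this page) and passing the reply through parsefixes() would yield candidate edits; a human would still need to verify each one, per the usual proofreading workflow.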
BTW I created an article for LangChain which has a couple good starter resources. Sandizer ( talk) 04:36, 18 April 2023 (UTC)
The current text starts by granting permission to use LLM text as a basis for discussion on talk pages. But how is this ever going to be appropriate as part of the process of building an encyclopedia (as opposed to FORUM-esque discussion)? The prohibition was for using LLMs for 'arguing your case', but the problem I'm seeing is not just this, but people using them for random [3] contributions; and if they're used for closing RfCs/AfDs etc ... ? Arghh. I have tried to clarify. Bon courage ( talk) 02:06, 7 April 2023 (UTC)
Changing "you should not use LLMs to 'argue your case for you' in talk page discussions" to "you must not use LLMs to write your comments" was in my opinion a pretty significant change in meaning here. To be clear, if a less-than-confident English speaker has a good argument, an argument of their own construction, should they be allowed to use an LLM to work out the phrasing and essentially have it "write their comment"? Or do we just say that competence is required and that editors who can't phrase their own arguments should not be on talk pages to begin with? PopoDameron talk 10:46, 8 April 2023 (UTC)
while you may include an LLM's raw outputs as examples in order to discuss them or to illustrate a point
This proposal was made by a sockpuppet of a blocked user, and it doesn't look like it's going anywhere. — David Eppstein ( talk) 01:39, 13 April 2023 (UTC)
The following discussion has been closed. Please do not modify it.
I'm proposing merging Wikipedia:Using neural network language models on Wikipedia into Wikipedia:Large language models. I think the sections of the former would fit well into WP:LLM, and it would ease in the process of creating a new guideline on ChatGPT. – CityUrbanism 🗩 🖉 18:48, 10 April 2023 (UTC)
AI seems like a semiautomated (bot-like) tool to me, and I am missing any mention of already existing policies like WP:MEATBOT or WP:BOTUSE on this page. I believe AI is a good thing for Wikipedia; I imagine AI crawling through the stubs and expanding the ones that are expandable. What I do not think is good for Wikipedia is if AI is used to generate thousands of stubs. Paradise Chronicle ( talk) 05:55, 21 April 2023 (UTC)
You must not use LLMs for unapproved bot-like editing, or anything even approaching bot-like editing. Using LLMs to assist high-speed editing in article space is always taken to fail the standards of responsible use, as it is impossible to rigorously scrutinize content for compliance with all applicable policies in such a scenario. — Alalch E. 11:37, 22 April 2023 (UTC)
For those who don't explicitly admit they use LLMs, the whole "policy" is useless. I'd prefer that an editing pattern which appears to fall under LLM use be the focus of the policy. If editors refuse to admit that they use AI, it will be similar to MEATBOT or MASSCREATE, which are also hardly applied, and even defining who actually falls under those two policies is a problem. Paradise Chronicle ( talk) 08:33, 23 April 2023 (UTC)
Yesterday I tried to make a script which uses an LLM to check whether an article's cited sources support its text. I'm still working on that, but it's quite a bit more of an undertaking than I first thought. In the meantime, this script will take the plain text of an article (with all references stripped), select a number of passages it thinks should have a source, use web search to try to find some, and in a small fraction of such cases, it actually works. I'm leaving it here as a proof of concept for how such tasks might be approached. (@ BD2412: it's much easier this way than using LangChain, although I'm not sure it's exactly better because I had to add special cases for e.g. string cleanup and un-clicktracking search results, for which I think LangChain has some under-the-hood support.) Anyway, here it is:
Python code to use Anthropic Claude and DuckDuckGo to try to verify article claims, with sample output
!pip install anthropic
import anthropic
from requests import get
from bs4 import BeautifulSoup
from urllib.parse import unquote

llm = anthropic.Client('[ api key from https://console.anthropic.com/account/keys ]')

def claude(prompt):  # get a response from the Anthropic Claude v1.3 LLM
    return llm.completion(model='claude-v1.3', temperature=0.85,
        prompt=f'{anthropic.HUMAN_PROMPT} {prompt}{anthropic.AI_PROMPT}',
        max_tokens_to_sample=5000, stop_sequences=[anthropic.HUMAN_PROMPT]
    )['completion']

def wikiarticle(title):  # multiline wiki article text string without references
    try:
        plaintext = list(get('https://en.wikipedia.org/w/api.php', params=
            {'action': 'query', 'format': 'json', 'titles': title,
             'prop': 'extracts', 'explaintext': True}
            ).json()['query']['pages'].values())[0]['extract']
    except:
        return '[Article not found; respond saying the article title is bad.]'
    if plaintext.strip() == '':
        return '[Article text is empty; respond saying the title is a redirect.]'
    return plaintext

def passagize(title):  # get passages in need of verification and search queries
    atext = wikiarticle(title)
    aintro = atext.split('==')[0].strip()
    passages = claude('For every passage which should have a source citation'
        + ' in the following Wikipedia article, provide each of them on separate'
        + ' lines beginning with "### ", then the excerpt, then " @@@ ", and'
        + ' finally a web search query you would use to find a source to verify'
        + ' the excerpt. Select at least one passage for every three sentences:\n\n'
        + atext[:16000]).split('###')  # truncate article to context window size
    pairs = []
    for p in passages:
        pair = p.strip().split('@@@')
        if len(pair) > 1:
            passage = pair[0].strip()
            query = pair[1].strip()
            if passage[0] == '"' and passage[-1] == '"':
                passage = passage[1:-1]
            if query[0] == '"' and query[-1] == '"':  # fully quoted query not intended
                query = query[1:-1]
            pairs.append((passage, query))
    return pairs

def duckduckget(query):  # web search: first page of DuckDuckGo is usually ~23 results
    page = get('http://duckduckgo.com/html/?q=' + query
        + ' -site:wikipedia.org',  # ignore likely results from Wikipedia
        headers={'User-Agent': 'wikicheck v.0.0.1-prealpha'})
    if page.status_code != 200:
        print(' !!! DuckDuckGo refused:', page.status_code, page.reason)
        return []
    soup = BeautifulSoup(page.text, 'html.parser')
    ser = []; count = 1
    for title, link, snippet in zip(
            soup.find_all('a', class_='result__a'),
            soup.find_all('a', class_='result__url', href=True),
            soup.find_all('a', class_='result__snippet')):
        url = link['href']
        if url[:25] == '//duckduckgo.com/l/?uddg=':  # click tracking
            goodurl = unquote(url[25:url.find('&rut=', len(url)-70)])
        else:
            goodurl = url
        ser.append((str(count), title.get_text(' ', strip=True), goodurl,
            snippet.get_text(' ', strip=True)))
        count += 1
    #print('  DDG returned', ser)
    return ser

def picksearchurls(statement, search, article):  # select URLs to get from search results
    prompt = ('Which of the following web search results would you use to try' +
        ' to verify the statement «' + statement + '» from the Wikipedia' +
        ' article "' + article + '"? Pick at least two, and answer only with' +
        ' their result numbers separated by commas:\n\n')
    searchres = duckduckget(search)
    #print('  DDG returned', len(searchres), 'results')
    if len(searchres) < 1:
        return []
    for num, title, link, snippet in searchres:
        prompt += ('Result ' + num + ': page: "' + title + '"; URL: ' + link +
            ' ; snippet: "' + snippet + '"\n')
    numbers = claude(prompt).strip()
    #print('  Claude wants search results:', numbers)
    if len(numbers) > 0 and numbers[0].isnumeric():
        resnos = []
        for rn in numbers.split(','):
            if rn.strip().isnumeric():
                resnos.append(int(rn.strip()))
        urls = []
        for n in resnos:
            urls.append(searchres[n - 1][2])
        return urls
    else:
        return []

def trytoverify(statement, search, article):  # 'search' is the suggested query string
    urls = picksearchurls(statement, search, article)
    if len(urls) < 1:
        print('  NO URLS for', search)
        return []
    #print('  URLs:', urls)
    retlist = []
    for url in urls:
        try:
            page = get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh;' +
                'Intel Mac OS X 10.15; rv:84.0) Gecko/20100101 Firefox/84.0'}).text
            pagetext = BeautifulSoup(page).get_text(' ', strip=True)  # EDIT: BUG FIXED; example output below is better
            #print('  fetching', url, 'returned', len(pagetext), 'characters')
        except:
            print('  fetching', url, 'failed')
            continue
        prompt = ('Is the statement «' + statement + '» from the Wikipedia' +
            ' article "' + article + '" verified by the following text from the' +
            ' source at ' + url + ' ? Answer either "YES: " followed by the excerpt' +
            ' which verifies the statement, or "NO." if this text does not verify' +
            ' the statement:\n\n' + pagetext[:16000])  # have to truncate again; this
            # is bad because the verification might be at the end, so this needs to
            # be done in chunks when it's long
        result = claude(prompt)
        #print('  for', url, 'Claude says:', result)
        if 'YES:' in result:
            retlist.append((url, result.split('YES:')[1].strip()))
    return retlist

def checkarticle(article):  # main routine; call this on a non-redirect title
    pairs = passagize(article)
    for passage, query in pairs:
        print('Trying to verify «' + passage + '» using the query "' + query + '":')
        vs = trytoverify(passage, query, article)
        if len(vs) < 1:
            print('  NO verifications.')
        for url, excerpt in vs:
            print('  VERIFIED by', url, 'saying:', excerpt)
Example output:
Trying to verify «Carlson began his media career in the 1990s, writing for The Weekly Standard and other publications.» using the query "tucker carlson weekly standard":
  NO verifications.
Trying to verify «Carlson's father owned property in Nevada, Vermont, and islands in Maine and Nova Scotia.» using the query "dick carlson property":
  fetching https://soleburyhistory.org/program-list/honored-citizens/richard-f-carlson-2013/ failed
  NO verifications.
Trying to verify «In 1976, Carlson's parents divorced after the nine-year marriage reportedly "turned sour".» using the query "tucker carlson parents divorce":
  NO verifications.
Trying to verify «Carlson's mother left the family when he was six and moved to France.» using the query "tucker carlson mother leaves":
  NO verifications.
Trying to verify «Carlson was briefly enrolled at Collège du Léman, a boarding school in Switzerland, but said he was "kicked out".» using the query "tucker carlson college du leman":
  fetching https://abtc.ng/tucker-carlson-education-tucker-carlsons-high-school-colleges-qualifications-degrees/ failed
  NO verifications.
Trying to verify «He then worked as an opinion writer at the Arkansas Democrat-Gazette newspaper in Little Rock, Arkansas, before joining The Weekly Standard news magazine in 1995.» using the query "tucker carlson arkansas democrat gazette":
  fetching https://www.cjr.org/the_profile/tucker-carlson.php failed
  NO verifications.
Trying to verify «Carlson's 2003 interview with Britney Spears, wherein he asked if she opposed the ongoing Iraq War and she responded, "[W]e should just trust our president in every decision he makes", was featured in the 2004 film Fahrenheit 9/11, for which she won a Golden Raspberry Award for Worst Supporting Actress at the 25th Golden Raspberry Awards.» using the query "tucker carlson britney spears interview":
  fetching https://classic.esquire.com/article/2003/11/1/bending-spoons-with-britney-spears failed
  fetching https://www.cnn.com/2003/SHOWBIZ/Music/09/03/cnna.spears/ failed
  NO verifications.
Trying to verify «Carlson announced he was leaving the show roughly a year after it started on June 12, 2005, despite the Corporation for Public Broadcasting allocating money for another show season.» using the query "tucker carlson leaves pbs show":
  NO verifications.
Trying to verify «MSNBC (2005–2008) Tucker was canceled by the network on March 10, 2008, owing to low ratings; the final episode aired on March 14, 2008.» using the query "tucker carlson msnbc show cancellation":
  fetching https://www.msn.com/en-us/tv/other/is-this-the-end-of-tucker-carlson/ar-AA1ahALw failed
  NO verifications.
Trying to verify «He remained with the network as a senior campaign correspondent for the 2008 election.» using the query "tucker carlson msnbc senior campaign correspondent":
  fetching https://www.c-span.org/person/?41986/TuckerCarlson failed
  VERIFIED by https://www.reuters.com/article/industry-msnbc-dc-idUSN1147956320080311 saying: "He will remain with the network as senior campaign correspondent after the show goes off the air Friday."
Trying to verify «Carlson had cameo appearances as himself in the Season 1 episode "Hard Ball" of 30 Rock and in a Season 9 episode of The King of Queens.» using the query "tucker carlson 30 rock cameo":
  fetching https://www.imdb.com/title/tt0496424/characters/nm1227121 failed
  fetching https://www.tvguide.com/celebrities/tucker-carlson/credits/3000396156/ failed
  fetching https://www.britannica.com/biography/Tucker-Carlson failed
  NO verifications.
Trying to verify «Tucker Carlson Tonight aired at 7:00 p.m. each weeknight until January 9, 2017, when Carlson's show replaced Megyn Kelly at the 9:00 p.m. time slot after she left Fox News.» using the query "tucker carlson tonight replaces megyn kelly":
  VERIFIED by https://www.orlandosentinel.com/entertainment/tv-guy/os-fox-news-tucker-carlson-replaces-megyn-kelly-20170105-story.html saying: "Tucker Carlson Tonight" debuted at 7 p.m. in November.
Obviously it still has a long way to go to be genuinely useful, but I hope someone gets something from it. I'm going to keep trying for something that attempts to verify existing sources. Sandizer ( talk) 12:20, 27 April 2023 (UTC)
It sort of works!
Python code to verify an article's existing reference URLs with sample output
!pip install anthropic
import anthropic
from requests import get
from bs4 import BeautifulSoup as bs
from re import sub as resub, match as rematch, finditer

llm = anthropic.Client('[ api key from https://console.anthropic.com/account/keys ]')

def claude(prompt):  # get a response from the Anthropic Claude v1.3 LLM
    return llm.completion(model='claude-v1.3', temperature=0.85,
        prompt=f'{anthropic.HUMAN_PROMPT} {prompt}{anthropic.AI_PROMPT}',
        max_tokens_to_sample=1000, stop_sequences=[anthropic.HUMAN_PROMPT]
    )['completion']

def textarticlewithrefs(title):
    # get English Wikipedia article in plain text but with numbered references including link URLs
    resp = get('https://en.wikipedia.org/w/api.php?action=parse&format=json&page='
        + title).json()
    if 'error' in resp:
        raise FileNotFoundError(f"'{title}': {resp['error']['info']}")
    html = resp['parse']['text']['*']  # get parsed HTML
    if '<div class="redirectMsg"><p>Redirect to:</p>' in html:  # recurse redirects
        return textarticlewithrefs(resub(r'.*<ul class="redirectText"><li><a'
            + ' href="/info/en/?search=([^"]+)"[^\0]*', '\\1', html))
    cleantitle = resp['parse']['title']  # fixes urlencoding and unicode escapes
    try:
        body, refs = html.split('<ol class="references">')
        #body += refs[refs.find('\n</ol></div>')+12:]  # move external links etc. up
    except:
        body = html; refs = ''
    b = resub(r'\n<style.*?<table [^\0]*?</table>\n', '\n', body)  # rm boxes
    #print(b)
    b = resub(r'<p>', '\n<p>', b)  # newlines between paragraphs
    b = resub(r'(</table>)\n', '\\1 \n', b)  # space after amboxes
    b = resub(r'(<span class="mw-headline" id="[^"]*">.+?)(</span>)',
        '\n\n\\1:\\2', b)  # put colons after section headings
    b = resub(r'([^>])\n([^<])', '\\1 \\2', b)  # merge non-paragraph break
    b = resub(r'<li>', '<li>* ', b)  # list item bullets for beautifulsoup
    b = resub(r'(</[ou]l>)', '\\1\n\n<br/>', b)  # blank line after lists
    b = resub(r'<img (.*\n)', '<br/>--Image: <img \\1\n<br/>\n', b)  # captions
    b = resub(r'(\n.*<br/>--Image: .*\n\n<br/>\n)(\n<p>.*\n)',
        '\\2\n<br/>\n\\1', b)  # put images after following paragraph
    b = resub(r'(role="note" class="hatnote.*\n)', '\\1.\n<br/>\n', b)  # see/main
    b = resub(r'<a class="external text" href="(http[^"]+)">(.+?)</a>',
        '\\2 [ \\1 ]', b)  # extract external links as bracketed urls
    b = bs(b[b.find('\n<p>'):]).get_text(' ')  # to text; lead starts with 1st <p>
    b = resub(r'\s*([?.!,):;])', '\\1', b)  # various space cleanups
    b = resub(r'  *', ' ', resub(r'\( *', '(', b))  # rm double spaces and after (
    b = resub(r' *\n *', '\n', b)  # rm spaces around newlines
    b = resub(r'[ \n](\[\d+])', '\\1', b)  # rm spaces before inline refs
    b = resub(r' \[ edit \]\n', '\n', b).strip()  # drop edit links
    b = resub(r'\n\n\n+', '\n\n', b)  # rm vertical whitespace
    r = refs[:refs.find('\n</ol></div>')+1]  # optimistic(?) end of reflist
    r = resub(r'<li id="cite_note.*?-(\d+)">[^\0]*?<span class='  # enumerate...
        + '"reference-text"[^>]*>\n*?([^\0]*?)</span>\n?</li>\n',
        '[\\1] \\2\n', r)  # ...the references as numbered separate lines
    r = resub(r'<a class="external text" href="(http[^"]+)">(.+?)</a>',
        '\\2 [ \\1 ]', r)  # extract external links as bracketed urls
    r = bs(r).get_text(' ')  # unHTMLify
    r = resub(r'\s([?.!,):;])', '\\1', r)  # space cleanups again
    r = resub(r'  *', ' ', '\n' + r)  # rm double spaces, add leading newline
    r = resub(r'\n\n+', '\n', r)  # rm vertical whitespace
    r = resub(r'(\n\[\d+]) [*\n] ', '\\1 ', r)  # multiple source ref tags
    r = resub(r'\n ', '\n  ', r)  # indent multiple source ref tags
    refdict = {}  # refnum as string -> (reftext, first url)
    for ref in r.split('\n'):
        if len(ref) > 0 and ref[0] == '[':
            rn = ref[1:ref.find(']')]
            reftext = ref[ref.find(']')+2:]
            if '[ http' in reftext:
                firsturl = reftext[reftext.find('[ http')+2:]
                firsturl = firsturl[:firsturl.find(' ]')]
                refdict[rn] = (reftext, firsturl)
    return cleantitle + '\n\n' + b + r, refdict

def verifyrefs(article):  # Wikipedia article title
    atext, refs = textarticlewithrefs(article)
    title = atext.split('\n')[0]
    print('Trying to verify references in:', title)
    for par in atext.split('\n'):
        if par == 'References:' or rematch(r'\[\d+] [^[].+', par):
            continue  # ignore references section of article
        for m in list(finditer(r'\[\d+]', par)):
            refnum = par[m.start()+1:m.end()-1]
            excerpt = par[:m.end()]
            if refnum in refs:
                reftext, url = refs[refnum]
                print(' checking ref [' + refnum + ']:', excerpt)
                print('  reference text:', reftext)
                try:
                    page = get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; ' +
                        'Intel Mac OS X 10.15; rv:84.0) Gecko/20100101 Firefox/84.0'}).text
                    pagetext = bs(page).get_text(' ', strip=True)
                    print('  fetching', url, 'returned', len(pagetext), 'characters')
                except:
                    print('  failed to fetch', url)
                    continue
                prompt = ('Can the following excerpt from the Wikipedia article "'
                    + title + '" be verified by its reference [' + refnum + ']?'
                    + '\n\nThe excerpt is: ' + excerpt + '\n\nAnswer either'
                    + ' "YES: " followed by the sentence of the source text'
                    + ' confirming the excerpt, or "NO: " followed by the reason'
                    + ' that it does not. The source text for reference ['
                    + refnum + '] (' + reftext.strip() + ') is:\n\n'
                    + pagetext[:10000])  # truncated source text, TODO: chunk
                print('  response:', resub(r'\s+', ' ', claude(prompt)).strip())
            else:
                print('  reference [' + refnum + '] has no URL')
Sample output from this random article: verifyrefs('Elise Konstantin-Hansen')
Trying to verify references in: Elise Konstantin-Hansen
There is still much to be done, e.g., chunking when the source text is too big for the context window, PDF text extraction, and when a reference number occurs more than once in the same paragraph, the subsequent excerpts should not include any of the text up to and including earlier occurrences. Also, some of the verification decisions are plainly wrong because it wasn't focused on the specific text before the reference. I will work on those things tomorrow. Sandizer ( talk) 04:37, 29 April 2023 (UTC)
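The chunking step mentioned above could take roughly this shape (a sketch only; chunktext and verifychunked are hypothetical names, not part of the script): split the source text into overlapping windows that fit the context limit, query each window, and accept a verification if any window answers YES.

```python
# Rough sketch of chunked verification: overlapping windows, so a verifying
# sentence that straddles a chunk boundary still appears whole in some chunk.
def chunktext(text, size=10000, overlap=500):
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += size - overlap  # step back by the overlap before continuing
    return chunks

def verifychunked(make_prompt, pagetext, ask):
    # ask() would be an LLM helper like claude(); make_prompt() builds the
    # YES/NO verification prompt for one chunk of source text
    for chunk in chunktext(pagetext):
        result = ask(make_prompt(chunk))
        if 'YES:' in result:
            return result  # first chunk that verifies wins
    return 'NO: not verified by any chunk'
```

The overlap size would need tuning so that no verifying passage is longer than the overlap; the per-reference-occurrence excerpt fix would be separate from this.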
Here are some news articles since the last update I posted. — The Transhumanist 10:26, 2 May 2023 (UTC)
I've trimmed this draft (see diff). My previous thread veered into a discussion about a single section, but I think we need to discuss this seriously. After a break, I'm coming back to this draft with fresh eyes, and it's bad.
Most of it is cruft. It's meandering and frequently repeats itself. It bans the use of LLMs to spam talk pages, but talk page spam is already against the rules. It says that LLM-assisted drafts must comply with policy or be rejected, but that's already true for all drafts. It says that editors can't do high-speed editing with LLMs, but they already can't do high-speed editing.
Keep in mind that policies are supposed to reflect existing consensus, not create new rules out of thin air. But this draft fails that metric. It recommends WP:G3, but admins have refused G3 nominations of LLM articles. There is no consensus for creating a new criterion, and therefore, for mentioning any CSD in this draft. There's no consensus that LLM-use should be reserved for experienced editors (as if experienced editors never misbehaved!). And a policy shouldn't idly muse about whether LLMs comply with CC BY-SA; the U.S. Copyright Office ruled that LLM outputs were not copyrighted, so our musings don't belong in policy. DFlhb ( talk) 23:44, 21 March 2023 (UTC)
"Never paste LLM outputs directly into Wikipedia. You must rigorously scrutinize all your LLM-assisted edits before hitting 'Publish'." The main goal of a policy isn't enforcement, it's prevention. And the clearer a policy is, the more people will comply. I'd bet that even with less guidance, my version would result in less rule-breaking and a lower cleanup burden, not a higher one. WP:LLM doesn't have the recognizability of BLP or V or NOR, so it can only make up for that by being limpid and concise, if we want high adherence. Besides, what is guidance doing in a policy? I'd strongly vote against adoption, as is. It's dead-on-arrival. DFlhb ( talk) 03:25, 22 March 2023 (UTC)
"clarify how existing rules apply" (quoting Phlsph7), not in a policy. DFlhb ( talk) 17:56, 23 March 2023 (UTC)
"LLM" has zero impact, and "large language model" has low impact. So I do think readability deserves some improvements. DFlhb ( talk) 03:36, 1 April 2023 (UTC)
@ DFlhb: Upon further reflection... well, honestly, upon a big-ass thread at WP:ANI, where people are desperately flailing around to figure out what to do with LLM content in mainspace. And upon this being the third or fourth time this has happened since December, I am somewhat inclined to come around to your point of view here. We need to have some policy about this, which means we should have an RfC, which means we should cut this down to an absolute minimum. I made Wikipedia:Large language model guidelines as a redirect a while ago. Here is my proposal: we split this into WP:LLM, a short and concise page per your edits earlier, and all the stuff currently here goes to WP:LLMG, which is a large beautiful page that comprehensively goes over everything in nice detail. Then we start an RfC here for a minimum viable policy, and potentially start one there for a guideline. What do you say? @ DFlhb and Alalch E.: jp× g 02:01, 4 April 2023 (UTC)
This is not a suggestion for improving the essay, just speculation. Closing per WP:FORUM. — The Hand That Feeds You: Bite 11:50, 13 May 2023 (UTC)
The following discussion has been closed. Please do not modify it.
Hello. I only read the main page; I did not read the discussion of users here. If my discussion is repetitive, please archive it. In the near future, AI will be able to analyze and reason to the extent of human intelligence and will have the ability to judge consensus on Wikipedia. In that case, will AI consensus be acceptable? Will self-aware AI contributions be welcomed on Wikipedia? -- Sunfyre ( talk) 08:44, 7 May 2023 (UTC)
DFlhb ( talk) 03:31, 1 April 2023 (UTC)
If we are going to attribute content added with the assistance of LLM, there are two things to keep in mind:
As there are legal implications to both of these, that means that regardless of our discussions here or what anyone's opinion or preference here is, the ultimate output of any policy designed here, must be based on those two pillars, and include them. Put another way: no amount of agreement or thousand-to-one consensus here, or in any forum in Wikipedia, can exclude or override any part of the Terms of Use of either party, period. We may make attribution requirements stricter, but not more lax than the ToU lays out. (In particular, neither WP:IAR nor WP:CONS can override ToU, which has legal implications.)
I'm more familiar with Wikimedia's ToU than ChatGPT's (which is nevertheless quite easy to understand). The page
WP:CWW interprets the ToU for English Wikipedia users; it is based on Wikimedia's
wmf:Terms of use, section
7. Licensing of Content, sub-sections b) Attribution, and c) Importing text. There's some legalese, but it's not that hard to understand, and amounts to this: the attribution must state the source of the content, and must 1) link to it, and 2) be present in the edit summary. The
WP:CWW page interpretation offers some suggested boilerplate attribution (e.g., Content in this edit was copied from [[FOO]]; see that article's history for attribution.
) for
sister projects, and for outside content with compatible licenses. (One upshot of this is that *if* LLM attribution becomes necessary, suggestions such as the one I've seen on the project page to use an article-bottom template will not fly.)
Absent any update to the WMF ToU regarding LLM content, we are restricted only by the LLM ToU at the moment. The flip side of this is that one has to suspect or assume that WMF is currently considering LLM usage and attribution, and if and when they update the ToU, the section in any proposed new LLM policy may have to be rewritten. The best approach for an attribution section now, in my opinion, is to keep it as short as possible, so it may be amended easily if and when WMF updates its ToU for LLMs. In my view, the attribution section of our proposed policy should be short and inclusive, without adding other frills for now, something like this:
Once WMF addresses LLMs, we could modify this to be more specific. (I'll go ask them and find out, and link back here.)
We may also need to expand and modify it for each flavor of LLM. ChatGPT's sharing/publication policy is quite easy to read and understand. There are four bullets, and some suggested "stock language". I'd like to address this later, after having a chat with WMF.
Note that it's perfectly possible that WMF may decide that attribution to non-human agents is not needed, in which case we will be bound only by the LLM's ToU; but in that case, I'd advocate for stricter standards on our side; however, it's hard to discuss that productively until we know what WMF's intentions are. (If I had to guess, I would bet that there are discussions or debates going on right now at WMF legal about the meaning of "creative content", which is a key concept underlying the current ToU, and if they decide to punt on any new ToU, they will just be pushing the decision about what constitutes "creative content" downstream onto the 'Pedias, which would be disastrous, imho; but I'm predicting they won't do that.) I'll report back if I find anything out. Mathglot ( talk) 03:51, 10 April 2023 (UTC)
"...you warrant that the text is available under terms that are compatible with the CC BY-SA 3.0 license (or, as explained above, another license when exceptionally required by the Project edition or feature)("CC BY-SA"). ... You agree that, if you import text under a CC BY-SA license that requires attribution, you must credit the author(s) in a reasonable fashion." It gives attribution in the edit summary as an example for copying within Wikimedia projects, but doesn't prescribe this as the only reasonable fashion. Specifically regarding OpenAI, though, based on its terms of use, it assigns all rights to the user. So even if the U.S. courts one day ruled that a program could hold authorship rights, attribution from a copyright perspective is not required. OpenAI's sharing and publication policy, though, requires that
The role of AI in formulating the content is clearly disclosed in a way that no reader could possibly miss, and that a typical reader would find sufficiently easy to understand.
"The attribution requirements are sometimes too intrusive for particular circumstances (regardless of the license), and there may be instances where the Wikimedia community decides that imported text cannot be used for that reason." In a similar manner, it may be the case that the community decides that enabling editors to satisfy the disclosure requirement of the OpenAI sharing and publication policy is too intrusive. isaacl ( talk) 04:52, 10 April 2023 (UTC)
DFlhb had removed as a part of their reverted trim, and now I've removed it again. This topic is covered in wmf:Terms of Use/en. No useful specific guidance was provided here. There's no agreement that a policy needs to require use of Template:OpenAI as it is not obviously compatible with OpenAI ToS requirements. Editors advocating to include specific guidance about requiring attribution on this page should get consensus for the concrete version of text about this that they are committed to and want to see it becoming Wikipedia policy. — Alalch E. 11:41, 14 April 2023 (UTC)
It has been proposed to me on Wikisource that LLMs would be useful for predicting and proposing fixes to transcription errors. Is there a place to discuss how such a thing might technically be implemented? BD2412 T 23:00, 15 April 2023 (UTC)
BTW I created an article for LangChain which has a couple good starter resources. Sandizer ( talk) 04:36, 18 April 2023 (UTC)
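For the Wikisource idea above, one rough way it might be approached (a sketch only; transcription_prompt is a hypothetical helper modeled on the prompt-building style of the scripts further down this page) is to ask the model merely to flag suspect lines, leaving a human to apply each fix:

```python
def transcription_prompt(page_text):
    """Build a prompt asking an LLM to flag probable OCR misreads
    in a transcribed page, without rewriting the page wholesale,
    so that a human reviewer stays in the loop."""
    return ('The following is an OCR transcription of a scanned page and'
            ' may contain scanning errors (e.g. "rn" misread as "m", or'
            ' "1" as "l"). List only the words or lines you suspect are'
            ' misread, each on its own line in the form'
            ' "original -> suggested fix", and nothing else:\n\n'
            + page_text)

# The resulting string would be sent to an LLM client (such as the
# claude() helper used in the verification scripts below), and each
# proposed fix reviewed by hand before editing the transcription.
```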
The current text starts by granting the permission to use LLM text as a basis for discussion on Talk pages. But how is this ever going to be appropriate as part of the process of building an encyclopedia (as opposed to FORUM-esque discussion)? The prohibition was for using LLMs for 'arguing your case', but the problem I'm seeing is not just this, but people using them for random [3] contributions; and what if they're used for closing RfCs/AfDs etc ... ? Arghh. I have tried to clarify. Bon courage ( talk) 02:06, 7 April 2023 (UTC)
The change from "you should not use LLMs to 'argue your case for you' in talk page discussions" to "you must not use LLMs to write your comments" was in my opinion a pretty significant change in meaning here. To be clear, if a less-than-confident English speaker has a good argument, an argument of their own construction, should they be allowed to use an LLM to work out the phrasing and essentially have it "write their comment"? Or do we just say that competence is required and that editors who can't phrase their own arguments should not be on talk pages to begin with? PopoDameron talk 10:46, 8 April 2023 (UTC)
while you may include an LLM's raw outputs as examples in order to discuss them or to illustrate a point
This proposal was made by a sockpuppet of a blocked user, and it doesn't look like it's going anywhere. — David Eppstein ( talk) 01:39, 13 April 2023 (UTC) |
---|
The following discussion has been closed. Please do not modify it. |
I'm proposing merging Wikipedia:Using neural network language models on Wikipedia into Wikipedia:Large language models. I think the sections of the former would fit well into WP:LLM, and it would ease in the process of creating a new guideline on ChatGPT. – CityUrbanism 🗩 🖉 18:48, 10 April 2023 (UTC)
|
AI seems like a semiautomated (bot-like) tool to me, and I am missing any mention of already existing policies like WP:MEATBOT or WP:BOTUSE on this page. I believe AI is a good thing for Wikipedia; I imagine AI crawling through the stubs and expanding the ones that are expandable. What I do not think is good for Wikipedia is if AI is used to generate thousands of stubs. Paradise Chronicle ( talk) 05:55, 21 April 2023 (UTC)
You must not use LLMs for unapproved bot-like editing, or anything even approaching bot-like editing. Using LLMs to assist high-speed editing in article space is always taken to fail the standards of responsible use, as it is impossible to rigorously scrutinize content for compliance with all applicable policies in such a scenario.— Alalch E. 11:37, 22 April 2023
For the ones who don't explicitly admit they use LLMs, the whole "policy" is useless. I'd prefer that editing patterns which appear to fall under LLM use be the focus of the policy. If editors refuse to admit that they use AI, it will be similar to MEATBOT or MASSCREATE, which are also hardly applied; even defining who actually falls under those two policies is a problem. Paradise Chronicle ( talk) 08:33, 23 April 2023 (UTC)
Yesterday I tried to make a script which uses an LLM to check whether an article's cited sources support its text. I'm still working on that, but it's quite a bit more of an undertaking than I first thought. In the mean time, this script will take the plain text of an article (with all references stripped), select a number of passages it thinks should have a source, use web search to try to find some, and in a small fraction of such cases, it does actually work. I'm leaving it here as a proof of concept for how such tasks might be approached. (@ BD2412: it's much easier this way than using LangChain, although I'm not sure it's exactly better because I had to add special cases for e.g. string cleanup and un-clicktracking search results, for which I think LangChain has some under-the-hood support.) Anyway, here it is:
Python code to use Anthropic Claude and DuckDuckGo to try to verify article claims, with sample output
!pip install anthropic
import anthropic
from requests import get
from bs4 import BeautifulSoup
from urllib.parse import unquote
llm = anthropic.Client('[ api key from https://console.anthropic.com/account/keys ]')
def claude(prompt): # get a response from the Anthropic Claude v1.3 LLM
return llm.completion(model='claude-v1.3', temperature=0.85,
prompt=f'{anthropic.HUMAN_PROMPT} {prompt}{anthropic.AI_PROMPT}',
max_tokens_to_sample=5000, stop_sequences=[anthropic.HUMAN_PROMPT]
)['completion']
def wikiarticle(title): # multiline wiki article text string without references
try:
plaintext = list(get('https://en.wikipedia.org/w/api.php', params=
{'action': 'query', 'format': 'json', 'titles': title,
'prop': 'extracts', 'explaintext': True}
).json()['query']['pages'].values())[0]['extract']
except:
return '[Article not found; respond saying the article title is bad.]'
if plaintext.strip() == '':
return '[Article text is empty; respond saying the title is a redirect.]'
return plaintext
def passagize(title): # get passages in need of verification and search queries
atext = wikiarticle(title)
aintro = atext.split('==')[0].strip()
passages = claude('For every passage which should have a source citation'
+ ' in the following Wikipedia article, provide each of them on separate'
+ ' lines beginning with "### ", then the excerpt, then " @@@ ", and'
+ ' finally a web search query you would use to find a source to verify'
+ ' the exerpt. Select at least one passage for every three sentences:\n\n'
+ atext[:16000]).split('###') # truncate article to context window size
pairs = []
for p in passages:
pair = p.strip().split('@@@')
if len(pair) > 1:
passage = pair[0].strip()
query = pair[1].strip()
if passage[0] == '"' and passage[-1] == '"':
passage = passage[1:-1]
if query[0] == '"' and query[-1] == '"': # fully quoted query not intended
query = query[1:-1]
pairs.append((passage, query))
return pairs
def duckduckget(query): # web search: first page of DuckDuckGo is usually ~23 results
page = get('http://duckduckgo.com/html/?q=' + query
+ ' -site:wikipedia.org', # ignore likely results from Wikipedia
headers={'User-Agent': 'wikicheck v.0.0.1-prealpha'})
if page.status_code != 200:
print(' !!! DuckDuckGo refused:', page.status_code, page.reason)
return []
soup = BeautifulSoup(page.text, 'html.parser')
ser = []; count = 1
for title, link, snippet in zip(
soup.find_all('a', class_='result__a'),
soup.find_all('a', class_='result__url', href=True),
soup.find_all('a', class_='result__snippet')):
url = link['href']
if url[:25] == '//duckduckgo.com/l/?uddg=': # click tracking
goodurl = unquote(url[25:url.find('&rut=', len(url)-70)])
else:
goodurl = url
ser.append((str(count), title.get_text(' ', strip=True), goodurl,
snippet.get_text(' ', strip=True)))
count += 1
#print(' DDG returned', ser)
return ser
def picksearchurls(statement, search, article): # select URLs to get from search results
prompt = ('Which of the following web search results would you use to try' +
' to verify the statement «' + statement + '» from the Wikipedia' +
' article "' + article + '"? Pick at least two, and answer only with' +
' their result numbers separated by commas:\n\n')
searchres = duckduckget(search)
#print(' DDG returned', len(searchres), 'results')
if len(searchres) < 1:
return []
for num, title, link, snippet in searchres:
prompt += ('Result ' + num + ': page: "' + title + '"; URL: ' + link +
' ; snippet: "' + snippet + '"\n')
numbers = claude(prompt).strip()
#print(' Claude wants search results:', numbers)
if len(numbers) > 0 and numbers[0].isnumeric():
resnos = []
for rn in numbers.split(','):
if rn.strip().isnumeric():
resnos.append(int(rn.strip()))
urls = []
for n in resnos:
urls.append(searchres[n - 1][2])
return urls
else:
return []
def trytoverify(statement, search, article): # 'search' is the suggested query string
urls = picksearchurls(statement, search, article)
if len(urls) < 1:
print(' NO URLS for', search)
return []
#print(' URLs:', urls)
retlist = []
for url in urls:
page = get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh;' +
'Intel Mac OS X 10.15; rv:84.0) Gecko/20100101 Firefox/84.0'}).text
try:
pagetext = BeautifulSoup(page).get_text(' ', strip=True) # EDIT: BUG FIXED; example output below is better
#print(' fetching', url, 'returned', len(pagetext), 'characters')
except:
print(' fetching', url, 'failed')
continue
prompt = ('Is the statement «' + statement + '» from the Wikipedia' +
' article "' + article + '" verified by the following text from the' +
' source at ' + url + ' ? Answer either "YES: " followed by the excerpt' +
' which verifies the statement, or "NO." if this text does not verify' +
' the statement:\n\n' + pagetext[:16000]) # have to truncate again, this is bad because the verification might be at the end
# so, this needs to be done in chunks when it's long
result = claude(prompt)
#print(' for', url, 'Claude says:', result)
if 'YES:' in result:
retlist.append((url, result.split('YES:')[1].strip()))
return retlist
def checkarticle(article): # main routine; call this on a non-redirect title
pairs = passagize(article)
for passage, query in pairs:
print('Trying to verify «' + passage + '» using the query "' + query + '":')
vs = trytoverify(passage, query, article)
if len(vs) < 1:
print(' NO verifications.')
for url, excerpt in vs:
print(' VERIFIED by', url, 'saying:', excerpt)
Example output:

Trying to verify «Carlson began his media career in the 1990s, writing for The Weekly Standard and other publications.» using the query "tucker carlson weekly standard":
 NO verifications.
Trying to verify «Carlson's father owned property in Nevada, Vermont, and islands in Maine and Nova Scotia.» using the query "dick carlson property":
 fetching https://soleburyhistory.org/program-list/honored-citizens/richard-f-carlson-2013/ failed
 NO verifications.
Trying to verify «In 1976, Carlson's parents divorced after the nine-year marriage reportedly "turned sour".» using the query "tucker carlson parents divorce":
 NO verifications.
Trying to verify «Carlson's mother left the family when he was six and moved to France.» using the query "tucker carlson mother leaves":
 NO verifications.
Trying to verify «Carlson was briefly enrolled at Collège du Léman, a boarding school in Switzerland, but said he was "kicked out".» using the query "tucker carlson college du leman":
 fetching https://abtc.ng/tucker-carlson-education-tucker-carlsons-high-school-colleges-qualifications-degrees/ failed
 NO verifications.
Trying to verify «He then worked as an opinion writer at the Arkansas Democrat-Gazette newspaper in Little Rock, Arkansas, before joining The Weekly Standard news magazine in 1995.» using the query "tucker carlson arkansas democrat gazette":
 fetching https://www.cjr.org/the_profile/tucker-carlson.php failed
 NO verifications.
Trying to verify «Carlson's 2003 interview with Britney Spears, wherein he asked if she opposed the ongoing Iraq War and she responded, "[W]e should just trust our president in every decision he makes", was featured in the 2004 film Fahrenheit 9/11, for which she won a Golden Raspberry Award for Worst Supporting Actress at the 25th Golden Raspberry Awards.» using the query "tucker carlson britney spears interview":
 fetching https://classic.esquire.com/article/2003/11/1/bending-spoons-with-britney-spears failed
 fetching https://www.cnn.com/2003/SHOWBIZ/Music/09/03/cnna.spears/ failed
 NO verifications.
Trying to verify «Carlson announced he was leaving the show roughly a year after it started on June 12, 2005, despite the Corporation for Public Broadcasting allocating money for another show season.» using the query "tucker carlson leaves pbs show":
 NO verifications.
Trying to verify «MSNBC (2005–2008) Tucker was canceled by the network on March 10, 2008, owing to low ratings; the final episode aired on March 14, 2008.» using the query "tucker carlson msnbc show cancellation":
 fetching https://www.msn.com/en-us/tv/other/is-this-the-end-of-tucker-carlson/ar-AA1ahALw failed
 NO verifications.
Trying to verify «He remained with the network as a senior campaign correspondent for the 2008 election.» using the query "tucker carlson msnbc senior campaign correspondent":
 fetching https://www.c-span.org/person/?41986/TuckerCarlson failed
 VERIFIED by https://www.reuters.com/article/industry-msnbc-dc-idUSN1147956320080311 saying: "He will remain with the network as senior campaign correspondent after the show goes off the air Friday."
Trying to verify «Carlson had cameo appearances as himself in the Season 1 episode "Hard Ball" of 30 Rock and in a Season 9 episode of The King of Queens.» using the query "tucker carlson 30 rock cameo":
 fetching https://www.imdb.com/title/tt0496424/characters/nm1227121 failed
 fetching https://www.tvguide.com/celebrities/tucker-carlson/credits/3000396156/ failed
 fetching https://www.britannica.com/biography/Tucker-Carlson failed
 NO verifications.
Trying to verify «Tucker Carlson Tonight aired at 7:00 p.m. each weeknight until January 9, 2017, when Carlson's show replaced Megyn Kelly at the 9:00 p.m. time slot after she left Fox News.» using the query "tucker carlson tonight replaces megyn kelly":
 VERIFIED by https://www.orlandosentinel.com/entertainment/tv-guy/os-fox-news-tucker-carlson-replaces-megyn-kelly-20170105-story.html saying: "Tucker Carlson Tonight" debuted at 7 p.m. in November.
Obviously it still has a long way to go to be genuinely useful, but I hope someone gets something from it. I'm going to keep trying for something that attempts to verify existing sources. Sandizer ( talk) 12:20, 27 April 2023 (UTC)
It sort of works!
Python code to verify an article's existing reference URLs with sample output
!pip install anthropic
import anthropic
from requests import get
from bs4 import BeautifulSoup as bs
from re import sub as resub, match as rematch, finditer
llm = anthropic.Client('[ api key from https://console.anthropic.com/account/keys ]')
def claude(prompt): # get a response from the Anthropic Claude v1.3 LLM
return llm.completion(model='claude-v1.3', temperature=0.85,
prompt=f'{anthropic.HUMAN_PROMPT} {prompt}{anthropic.AI_PROMPT}',
max_tokens_to_sample=1000, stop_sequences=[anthropic.HUMAN_PROMPT]
)['completion']
def textarticlewithrefs(title):
# get English Wikipedia article in plain text but with numbered references including link URLs
resp = get('https://en.wikipedia.org/w/api.php?action=parse&format=json&page='
+ title).json()
if 'error' in resp:
raise FileNotFoundError(f"'{title}': {resp['error']['info']}")
html = resp['parse']['text']['*'] # get parsed HTML
if '<div class="redirectMsg"><p>Redirect to:</p>' in html: # recurse redirects
return textarticlewithrefs(resub(r'.*<ul class="redirectText"><li><a'
+ ' href="/wiki/([^"]+)"[^\0]*', '\\1', html))
cleantitle = resp['parse']['title'] # fixes urlencoding and unicode escapes
try:
body, refs = html.split('<ol class="references">')
#body += refs[refs.find('\n</ol></div>')+12:] # move external links etc. up
except:
body = html; refs = ''
b = resub(r'\n<style.*?<table [^\0]*?</table>\n', '\n', body) # rm boxes
#print(b)
b = resub(r'<p>', '\n<p>', b) # newlines between paragraphs
b = resub(r'(</table>)\n', '\\1 \n', b) # space after amboxes
b = resub(r'(<span class="mw-headline" id="[^"]*">.+?)(</span>)',
'\n\n\\1:\\2', b) # put colons after section headings
b = resub(r'([^>])\n([^<])', '\\1 \\2', b) # merge non-paragraph break
b = resub(r'<li>', '<li>* ', b) # list item bullets for beautifulsoup
b = resub(r'(</[ou]l>)', '\\1\n\n<br/>', b) # blank line after lists
b = resub(r'<img (.*\n)', '<br/>--Image: <img \\1\n<br/>\n', b) # captions
b = resub(r'(\n.*<br/>--Image: .*\n\n<br/>\n)(\n<p>.*\n)',
'\\2\n<br/>\n\\1', b) # put images after following paragraph
b = resub(r'(role="note" class="hatnote.*\n)', '\\1.\n<br/>\n', b) # see/main
b = resub(r'<a class="external text" href="(http[^"]+)">(.+?)</a>',
'\\2 [ \\1 ]', b) # extract external links as bracketed urls
b = bs(b[b.find('\n<p>'):]).get_text(' ') # to text; lead starts with 1st <p>
b = resub(r'\s*([?.!,):;])', '\\1', b) # various space cleanups
b = resub(r' *', ' ', resub(r'\( *', '(', b)) # rm double spaces and after (
b = resub(r' *\n *', '\n', b) # rm spaces around newlines
b = resub(r'[ \n](\[\d+])', '\\1', b) # rm spaces before inline refs
b = resub(r' \[ edit \]\n', '\n', b).strip() # drop edit links
b = resub(r'\n\n\n+', '\n\n', b) # rm vertical whitespace
r = refs[:refs.find('\n</ol></div>')+1] # optimistic(?) end of reflist
r = resub(r'<li id="cite_note.*?-(\d+)">[^\0]*?<span class=' # enumerate...
+ '"reference-text"[^>]*>\n*?([^\0]*?)</span>\n?</li>\n',
'[\\1] \\2\n', r) # ...the references as numbered separate lines
r = resub(r'<a class="external text" href="(http[^"]+)">(.+?)</a>',
'\\2 [ \\1 ]', r) # extract external links as bracketed urls
r = bs(r).get_text(' ') # unHTMLify
r = resub(r'\s([?.!,):;])', '\\1', r) # space cleanups again
r = resub(r' *', ' ', '\n' + r) # rm double spaces, add leading newline
r = resub(r'\n\n+', '\n', r) # rm vertical whitespace
r = resub(r'(\n\[\d+]) [*\n] ', '\\1 ', r) # multiple source ref tags
r = resub(r'\n ', '\n ', r) # indent multiple source ref tags
refdict = {} # refnum as string -> (reftext, first url)
for ref in r.split('\n'):
if len(ref) > 0 and ref[0] == '[':
rn = ref[1:ref.find(']')]
reftext = ref[ref.find(']')+2:]
if '[ http' in reftext:
firsturl = reftext[reftext.find('[ http')+2:]
firsturl = firsturl[:firsturl.find(' ]')]
refdict[rn] = (reftext, firsturl)
return cleantitle + '\n\n' + b + r, refdict
def verifyrefs(article): # Wikipedia article title
atext, refs = textarticlewithrefs(article)
title = atext.split('\n')[0]
print('Trying to verify references in:', title)
for par in atext.split('\n'):
if par == 'References:' or rematch(r'\[\d+] [^[].+', par):
continue # ignore references section of article
for m in list(finditer(r'\[\d+]', par)):
refnum = par[m.start()+1:m.end()-1]
excerpt = par[:m.end()]
if refnum in refs:
reftext, url = refs[refnum]
print(' checking ref [' + refnum + ']:', excerpt)
print(' reference text:', reftext)
try:
page = get(url, headers={'User-Agent': 'Mozilla/5.0 (Macintosh; ' +
'Intel Mac OS X 10.15; rv:84.0) Gecko/20100101 Firefox/84.0'}).text
pagetext = bs(page).get_text(' ', strip=True)
print(' fetching', url, 'returned', len(pagetext), 'characters')
except:
print(' failed to fetch', url)
continue
prompt = ( 'Can the following excerpt from the Wikipedia article "'
+ title + '" be verified by its reference [' + refnum + ']?'
+ '\n\nThe excerpt is: ' + excerpt + '\n\nAnswer either'
+ ' "YES: " followed by the sentence of the source text'
+ ' confirming the excerpt, or "NO: " followed by the reason'
+ ' that it does not. The source text for reference ['
+ refnum + '] (' + reftext.strip() + ') is:\n\n'
+ pagetext[:10000]) # truncated source text, TODO: chunk
print(' response:', resub(r'\s+', ' ', claude(prompt)).strip())
else:
print(' reference [' + refnum + '] has no URL')
Sample output from this random article: verifyrefs('Elise Konstantin-Hansen')
Trying to verify references in: Elise Konstantin-Hansen
There is still much to be done, e.g., chunking when the source text is too big for the context window, PDF text extraction, and when a reference number occurs more than once in the same paragraph, the subsequent excerpts should not include any of the text up to and including earlier occurrences. Also, some of the verification decisions are plainly wrong because it wasn't focused on the specific text before the reference. I will work on those things tomorrow. Sandizer ( talk) 04:37, 29 April 2023 (UTC)
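The chunking flagged in the TODO comments could look something like this (a sketch under assumptions: chunk_text is a hypothetical helper, and the window size matches the 16,000-character truncation already used above); overlapping the windows keeps a claim that straddles a boundary intact in at least one chunk:

```python
def chunk_text(text, size=16000, overlap=500):
    """Split text into windows of at most `size` characters, each
    overlapping the previous one by `overlap` characters, so that a
    sentence cut at one boundary still appears whole in the next chunk."""
    if size <= overlap:
        raise ValueError('size must exceed overlap')
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += size - overlap
    return chunks

# Each chunk would be sent to claude() in turn with the same
# verification prompt; a single "YES:" answer from any chunk would
# count as a verification, instead of only checking the first
# 16000 characters of the source text.
```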
Here are some news articles since the last update I posted. — The Transhumanist 10:26, 2 May 2023 (UTC)
I've trimmed this draft (see diff). My previous thread veered into a discussion about a single section, but I think we need to discuss this seriously. After a break, I'm coming back to this draft with fresh eyes, and it's bad.
Most of it is cruft. It's meandering and frequently repeats itself. It bans the use of LLMs to spam talk pages, but talk page spam is already against the rules. It says that LLM-assisted drafts must comply with policy or be rejected, but that's already true for all drafts. It says that editors can't do high-speed editing with LLMs, but they already can't do high-speed editing.
Keep in mind that policies are supposed to reflect existing consensus, not create new rules out of thin air. But this draft fails that metric. It recommends WP:G3, but admins have refused G3 nominations of LLM articles. There is no consensus for creating a new criterion, and therefore, for mentioning any CSD in this draft. There's no consensus that LLM-use should be reserved for experienced editors (as if experienced editors never misbehaved!). And a policy shouldn't idly muse about whether LLMs comply with CC BY-SA; the U.S. Copyright Office ruled that LLM outputs were not copyrighted, so our musings don't belong in policy. DFlhb ( talk) 23:44, 21 March 2023 (UTC)
"Never paste LLM outputs directly into Wikipedia. You must rigorously scrutinize all your LLM-assisted edits before hitting 'Publish'." The main goal of a policy isn't enforcement, it's prevention. And the clearer a policy is, the more people will comply. I'd bet that even with less guidance, my version would result in less rule-breaking and a lower cleanup burden, not a higher one. WP:LLM doesn't have the recognizability of BLP or V or NOR, so it can only make up for that by being limpid and concise, if we want high adherence. Besides, what is guidance doing in a policy? I'd strongly vote against adoption, as is. It's dead-on-arrival. DFlhb ( talk) 03:25, 22 March 2023 (UTC)
"clarify how existing rules apply" (quoting Phlsph7), not in a policy. DFlhb ( talk) 17:56, 23 March 2023 (UTC)
"LLM" has zero impact, and "large language model" has low impact. So I do think readability deserves some improvements. DFlhb ( talk) 03:36, 1 April 2023 (UTC)
@ DFlhb: Upon further reflection... well, honestly, upon a big-ass thread at WP:ANI, where people are desperately flailing around to figure out what to do with LLM content in mainspace. And upon this being the third or fourth time this has happened since December, I am somewhat inclined to come around to your point of view here. We need to have some policy about this, which means we should have an RfC, which means we should cut this down to an absolute minimum. I made Wikipedia:Large language model guidelines as a redirect a while ago. Here is my proposal: we split this into WP:LLM, a short and concise page per your edits earlier, and all the stuff currently here goes to WP:LLMG, which is a large beautiful page that comprehensively goes over everything in nice detail. Then we start an RfC here for a minimum viable policy, and potentially start one there for a guideline. What do you say? @ DFlhb and Alalch E.: jp× g 02:01, 4 April 2023 (UTC)
This is not a suggestion for improving the essay, just speculation. Closing per WP:FORUM. — The Hand That Feeds You: Bite 11:50, 13 May 2023 (UTC) |
---|
The following discussion has been closed. Please do not modify it. |
Hello. I only read the main page. I did not read the discussion of users here. If my discussion is repetitive, please archive it. In the near future, AI will be able to analyze and reason to the extent of human intelligence and have the ability to judge consensus on Wikipedia. In this case, will AI consensus be acceptable? Will self-aware AI contributions be welcomed to Wikipedia?-- Sunfyre ( talk) 08:44, 7 May 2023 (UTC)
|