Kasutaja:Estopedist1/Andmepäringud

Maintenance of pagesRedigeeri

--

Disambiguation/DAB-relatedRedigeeri

--

redirect-relatedRedigeeri

--

POOLELI

kategooriatega (categories)/mallidega (templates) seonduv

--

--

vikidevahelised Quarry'idRedigeeri

AND iwlinks.iwl_title NOT LIKE "%,%ич"
AND iwlinks.iwl_title NOT LIKE "User_talk%"
AND iwlinks.iwl_title NOT LIKE "User:%"
AND page.page_title NOT LIKE "Estopedist1%"
AND page.page_title NOT LIKE "%Eesti_100"
AND page.page_title NOT LIKE "%Eesti_200"
AND page.page_title NOT LIKE "%Tõlketalgutest_osavõtjad_ja_tõlgitavad_artiklid%"
AND page.page_title NOT LIKE "%Soome-ugri_artiklivõistlus%"
AND page.page_title NOT LIKE "Oluliste_artiklite_eeskujud%"
AND page.page_title NOT LIKE "%Baden-Württembergi kreisid%"

--

    • ru puhul NS <>0,1:
eira kui ruwikit on kasutatud tõlkeks ja v-o saab kuidagi eemaldada RUref-malliga kaasatulev ??
AND iwlinks.iwl_title NOT LIKE "%,%ич"
:AND iwlinks.iwl_title NOT LIKE "%армейский_корпус%"
AND iwlinks.iwl_title NOT LIKE "%корпус"
AND iwlinks.iwl_title NOT LIKE "%авиационный_полк%"
AND iwlinks.iwl_title NOT LIKE "%полк"
:AND iwlinks.iwl_title NOT LIKE "%Балтийского_флота"
AND iwlinks.iwl_title NOT LIKE "%Михайлович_(%"
AND page.page_title NOT LIKE "Polotski_vürstiriik" #8+ ruwiki linki
:AND page.page_title NOT LIKE "Püha_Andrease_ordeni_kavaleride_loend" #10+ ruwiki linki
AND page.page_title NOT LIKE "Valgevene_valitsejad" #8+ ruwiki linki
AND page.page_title NOT LIKE "Vene_fašism" #8+ ruwiki linki
AND page.page_title NOT LIKE "23._jalaväediviis_(Venemaa_Keisririik)" #8+ ruwiki linki
AND page.page_title NOT LIKE "Arkadi_Strugatski" #8+ ruwiki linki
AND page.page_title NOT LIKE "Boriss_Strugatski" #8+ ruwiki linki
: AND page.page_title NOT LIKE "Balti_laevastik_Teises_maailmasõjas" #8+ ruwiki linki
: AND page.page_title NOT LIKE "Keskpäeva_Maailm" #8+ ruwiki linki
: AND page.page_title NOT LIKE "Saksa_ordu_suurmarssalite_loend" #5+ ruwiki linki

--

Commons-relatedRedigeeri

--

--

VariaRedigeeri

statistics
users related
Wikidata

Ideas/ProblemsRedigeeri

Impossible ones?

TeooriaRedigeeri

  • NS 0 Article
  • NS 1
  • NS 2 User
  • NS 3
  • NS 4 Wikipedia
  • NS 5
  • NS 6 File
  • NS 7
  • NS 8 MediaWiki
  • NS 9
  • NS 10 Template
  • NS 11
  • NS 12 Help
  • NS 13
  • NS 14 Category
  • NS 15
  • NS 100 Portal
  • NS 828 Module
  • TH võimalik, et SQL max results give ca 133,000 entries
  • TH SQL saab panna umbes 600-700 rida #As of 2018
  • TH LEFT JOIN

(SELECT * FROM page_props WHERE pp_propname <> 'defaultsort') AS sorted ON pp_page = page_id #TH, et AS-klausel peab olema ja ON-klausel peab olema

  • TH kui pean silmas tabelit, siis nt page ja kui pean silmas tabeli üht rida, siis ",page_title, " #TH pers: ilmselt kui rea väärtus on arvuline, siis pole jutumärke vaja)
  • TH kui nt lisad kaks page-tabelit, siis assigneeri (AS-klausel) nad arusaadavalt, nt page AS talkpage, page AS NS0page
  • TH kui assigneerid, siis mitmesõnalise fraasi puhul ilmselt tuleb jutumärke kasutada
  • TH kui assigneerid SQLis juba kasutatud funktsiooni, nt from, siis tuleb kasutada jutumärke, et töötaks
  • operator precedence = AND ja siis OR-id. OR-idega fraasis pane need sulgudesse, nt AND (X OR Y), mitte ilma sulgudeta. Veel näide: OR (pl_title LIKE "%)" AND pl_title NOT LIKE "%(%")
  • TH AND page_is_redirect = 0 AND page_namespace = 0 #siin otsib ta üles ikkagi kõik redirectid, sest AND käsk on viimane
  • TH AND pl_title LIKE '%keelne_Vikipeedia%' AND pl_title LIKE '%keelne_vikipeedia%' #siis ta otsib vaid linke, mis annab 2. statement
  • WHERE clause can be combined with AND, OR, and NOT operators (w3schools.com)
  • LIKE operator is used in a WHERE clause to search for a specified pattern in a column (w3schools.com)
  • There are two wildcards used in conjunction with the LIKE operator:

% - The percent sign represents zero, one, or multiple characters _ - The underscore represents a single character (w3schools.com)

  • HAVING site LIKE '%.wikipedia.org' (kui tahan oma tehtud tabelist otsida, vt https://quarry.wmflabs.org/query/21061
  • The IN operator allows you to specify multiple values in a WHERE clause (w3schools.com), nt WHERE product_price IN (50,99) või IN (‘Red’, ‘Green’)
  • The IN operator is a shorthand for multiple OR conditions (w3schools.com)
  • AND cl_to IS NULL #zero is not NULL ja space ehk tühik ka mitte, NULL on columns without data
  • hea NULL-käsu kasutus koos EXISTSiga, et eemaldada nt redirectid, vt https://quarry.wmflabs.org/query/21254 (selle koostaja on ka õpetuse seal teinud!, RASKE SQL)

It is not possible to test for NULL values with comparison operators, such as =, <, or <>. We will have to use the IS NULL and IS NOT NULL operators instead.

  • TH et vabastada apostroof/ülakoma: (1) siis pane kahekordselt jutumärgid või apostroofid; (2) teine võimalus ilmselt vabastamisfunktsioonina kasutada kaldkriipsu (/)
  • UPPER, LOWER, INITCAP, eg LOWER(page_title) vs UPPER(page_title)
  • LTRIM #left space’id eemaldab; vs RTRIM
  • TH for commenting 3 possibilities: #Foo; -- Foo --; or /* Foo */
  • TH numeric fields are allowed to use without in quotes
  • PH <> or != #both means not equal
  • AND CHAR_LENGTH(cl_to) > 75
  • CHAR on fikseeritud pikkusega tekst
  • VARCHAR on muutuva pikkusega tekst
  • [charlist] #Sets and ranges of characters to match
  • [!charlist] #Matches only a character NOT specified within the brackets
  • TH AND p1.page_namespace % 2 != 0 #s.t, et paarisarvulised NSid võtab ära
  • TH page_namespace % 2 =1 #võimalik (et 2-se vahega ja alustamine 1-st), et see kuidagi võtab arvesse ns-id 1, 3, 5, 9, 11, 15 ja 103 namespace’id
  • TH eemalda X quarryist: (1) DABid või (2) languagelinks'id või (3a ja 3b) blue/red links
(1) AND pl_title/page_id NOT IN (SELECT pp_page FROM page_props WHERE pp_propname = 'disambiguation')
(2) AND page_id NOT IN (SELECT ll_from FROM langlinks)
(3a) AND pl_title NOT IN (SELECT page_title FROM page WHERE page_namespace = 0)
(3a) nt NS 14 (mis kui red links) eemaldamiseks: AND page_title IS NULL

.(3b) AND (SELECT COUNT(*) FROM page WHERE page_title = pl_title AND page_namespace = pl_namespace AND pl_namespace <>15)= 0 https://quarry.wmflabs.org/query/9481 --

(4) kui tahan otsingust alati kindlat artiklit vms välja jätta, siis AND page_id NOT IN (36422437, 28376039, 5518477, 5518482, 26880825) #Five redirects to the talk namespace which have been kept at WP:RFD
(5) INNER JOIN categorylinks AS cl ON page_id = cl_from WHERE cl_to != "Znamoičendad"
(6) TH et leida languagelinksideta art: AND pp_propname != 'wikibase_item' #https://quarry.wmflabs.org/query/311
(7) TH kuidas välja võtta nt unexisting templates: AND NOT EXISTS (SELECT * FROM page AS p1 WHERE p1.page_namespace = tl_namespace AND p1.page_title = tl_title) #https://quarry.wmflabs.org/query/2218 või tl_title NOT IN (select page_title from page where page_namespace=10) https://quarry.wmflabs.org/query/20396
(8) välja võtta artiklid kindla malliga: WHERE page_id NOT IN (SELECT tl_from FROM templatelinks AS tl WHERE tl_title LIKE "BioTakso_infokaste")
SET @row_number = 0; #SET ja SELECT on seotud asjad, https://stackoverflow.com/questions/3945361/set-versus-select-when-assigning-variables
SELECT (@row_number:=@row_number + 1) #ilmselt võimaldab see päringu iga rea ette panna soovitud numbrit, alustades 0-st ja samm on 1
SELECT CONCAT('',plwiki_p.page.page_title,'') AS pl,
SELECT CONCAT('',enwiki_p.page.page_title,'') AS en
SELECT CONCAT('#',el_to) as el_to
SELECT CONCAT('* [[Mall:Ns:', page namespace, ':', page_title, ']]') as talkpage
SELECT CONCAT('#', page_namespace,':',page_title, '
SELECT CONCAT('[[Mall:Ns:', page namespace, ':', page_title, ']]') #TH NS mängimine ka nii: CONCAT ja CASE-WHEN-THEN-END hea lahendus https://quarry.wmflabs.org/query/28349
SELECT CONCAT('#Kategoria:',cat_title,', ') AS cat_title
SELECT CONCAT('Kategoria', page_title, '') AS page_title
SELECT CONCAT("#Dyskusja:",page_title,"") AS page_title
SELECT CONCAT("",page_title,"&action=delete usuń") AS button
CONCAT('* ', art.page_title ,', Arutelu:', talk.page_title, '') AS talkpage
CONCAT('https://meta.wikimedia.org/w/index.php?action=history&title=Talk:',disc.page_title) AS history
SELECT '* ',page_title,'' #page_title peab olema komade vahel, muidu ei tööta
SELECT GROUP_CONCAT(cl_to SEPARATOR '|' #https://quarry.wmflabs.org/query/26697

SELECT COUNT(DISTINCT(page_title)) JOIN templatelinks ON tl_from = page_id AND tl_namespace = 10 AND tl_title IN ('Link_V', 'Link_VdQ') WHERE page_namespace = 0

  • SELECT DISTINCT statement is used to return only distinct (different) values. (w3schools.com)
  • SHOW databases LIKE "%_p" #see kirjutadagi quarry lahtrisse ja annab kõik databased, mis vikis, vt https://quarry.wmflabs.org/query/19760

TH page_title = p.page_title (kui eri asjad, nt EXISTSiga edukalt: https://quarry.wmflabs.org/query/21252

  • USE etwikisource_p;

--

  • UNION #to merge modified tables?

SELECT City FROM Customers UNION SELECT City FROM Suppliers ORDER BY City; https://quarry.wmflabs.org/query/9688 (raske, UNIONiga ja commonsiga, uurimata mai 2016) UNION ALL-iga https://quarry.wmflabs.org/query/4493

CONVERT (data_type(length),expression,style), ent SQL on nt CONVERT(page_title USING utf8), https://quarry.wmflabs.org/query/9068 when using CONVERT (vs CAST): CAST (expression AS data_type )

LOCATE (‘st’, ‘myteststring’, 3) #3 tähendab, et loendust alustatakse 3. märgist; POSITION() function is a synonym for the LOCATE() function MID(Target.page_title,1,LOCATE('(',Target.page_title)-2), selleks on reegel: MID(string, start, length) REPLACE(p1.page_title, "_", " ")  s.t, 1. argument on otsitav kogu string, 2. argument on mida otsida stringi seest ehk substring ja 3. argument, et mis seda 2. argumenti asendama hakkab, antud juhul otsitakse _ ja see asendatakse mitte millegiga SUBSTRING (expression, start, length ), nt SELECT x = SUBSTRING('abcdef', 2, 3)  return: bcd, sest algab teisest märgist ja sellest kaasa arvatult kolm märki edasi ehk b ja c ja d SELECT SUBSTRING (string, delimiter, count), sealjuures count on integer, mis indicating the nr of occurances of delimiter; (2) count kui on -1 võtab paremalt ja +1 vist võtab vasakult SELECT SUBSTRING ('Sakila' FROM -4 FOR 2); returns: -> 'ki' SUBSTRING INDEX (str, delim, count) SELECT SUBSTRING_INDEX(SUBSTRING(el_to, LOCATE('://', el_to) + 3), '/', 1) url (vt http://quarry.wmflabs.org/query/2949)

date form = dd-mm-yyyy DATE_FORMAT(rev_timestamp,"%Y-%m-%d") AS rev_timestamp AND rev_timestamp >= 20150101000000 AND rev_timestamp <= 20151231235959 #AND r.rev_timestamp BETWEEN 20141019000000 AND 20141119000000 FROM logging WHERE log_type = 'delete' AND log_action = 'delete' AND log_timestamp > ( NOW() - INTERVAL 24 MONTH ) #või AND rc.rc_timestamp >= DATE_SUB(NOW(), INTERVAL 2 HOUR)

  • rev_timestamp > DATE_FORMAT(DATE_SUB(NOW(),INTERVAL 8 DAY),'%Y%m%d%H%i%s')
  • SELECT now();, et saada kätte praegune kuupäev ja seejuures võib kasutada quarry'is kasutada mitut SELECTi
  • SELECT now() AS timestamp; #v-o seda alguses nutikas kasutada; või
  • SELECT CURRENT_DATE; #teine võimalus

-- SELECT 1 FROM Foo_table (pers: vs SELECT * #ilmselt küll üks ja samad need? will return a column of 1's for every row in the table. You could use it with a where statement to check whether you have an entry for a given key, as in: if exists(select 1 from table where some_column = 'some_value') What your friend was probably saying is instead of making bulk selects with select * from table, you should specify the columns that you need precisely, for two reasons: 1) performance & you might retrieve more data than you actually need. 2) the query's user may rely on the order of columns. If your table gets updated, the client will receive columns in a different order than expected. nt https://quarry.wmflabs.org/query/5388

--

  • GROUP BY peab olema enne kui ORDER BY - muidu ei tööta SQL!
  • GROUP BY page_title #nt vajalik, kui annab ühte samu tiitleid palju
  • GROUP BY page_id
  • GROUP BY 1 refers to the first column in select statement which is account_id (www)
  • HAVING this modifier can be used with GROUP BY
  • HAVING COUNT(*) >10, nt HAVING COUNT(pl_from)>2
  • HAVING COUNT(*) <= 1 #st et hõlmab väiksema või võrdne 1-ga
  • HAVING COUNT(cl_from)-COUNT(pp_propname) = 0, nt https://quarry.wmflabs.org/query/20406; veel 1 COUNTiga https://quarry.wmflabs.org/query/21154
  • HAVING imagelinks (AS-iga saadud tabel, COUNT) + links (AS-iga saadud tabel, COUNT) <= 1, nt https://quarry.wmflabs.org/query/20080
  • SELECT COUNT(*), ll_lang from langlinks group by 2 order by 1 desc limit 20;
  • SELECT COUNT ja HAVING variant #https://quarry.wmflabs.org/query/20080
  • ORDER BY RAND()
  • ORDER BY CEIL(LOG10(COUNT(*))), el_index #
  • ORDER BY COUNT(ll_from) DESC
  • ORDER BY 2,1 #it sorts the result by second and first column

See alsoRedigeeri