philosophy of statistics philosophy of science replication crisis severity p-values error statistics severe testing statistical inference statistics significance tests likelihood principle r. a. fisher deborah mayo role of probability in inference statistical significance tests foundations of statistics big data replicability bayes factors biasing selection effects confidence intervals asa 2016 statement on p-values error probabilities neyman-pearson statistical methodology evidence reproducibility induction statistical reforms statistics wars bayesian inference popper fisher problem of induction replication paradox lse ph500 psychology association for psychological science (aps) 2015 c j. neyman richard royall higgs boson falsification d. mayo experimental philosophy reformulation of statistical tests error control sir david cox frequentist statistics bayesian statistics asa error statistics statistical inference significance testing bernoulli trials reproducibility psychology stephen senn confirmation meta-methodology aris spanos e. pearson roles of probability in inference casual inference repligate modeling machine learning statistical modus tollens calibration selection effects background assumptions testing reasoning jeffreys-lindley paradox statistics war statistical inference as severe testing replication reforms double-counting higgs discovery severity vs. rubbing off capabilities of methods fallacies of rejection statistical fraud busting data mining capability & severity logic non-rejection p-hacking bayesian vs frequentist statistics likelihood default priors american statistical association fisherian tests duality of tests & confidence intervals (cis) reliability asa task force statement 2021 stopping rules revised role of error probabilities frequentist inference data-dredging use-constructed psa 2018 statistical crisis in science large-n problem p-values exaggerate esp-bem-bayes likelihood principle violations predesignation bad evidence no test (bent) law of likelihood severity criterion formal epistemology carnap demarcation inductive logicians vs deductive testers enumerative induction asymmetry of falsification justifying induction uniformity of nature measures of confirmation logic of statistical inference paradox of irrelevant conjunctions glymour hacking frequentist principle of evidence (fev) sev large n-problem interpreting negative results optional stopping unification valid vs. invalid argument premises modus tollens modus ponens disjunctive syllogism soundness logic of simple significance tests deductively valid argument categorical syllogism version of modus ponens detach conclusion affirming the consequent exercises parameter likelihood ratio auxiliary hypotheses human medical observational studies failure to replicate phd stan young edwards deming data analysis scientific integrity geraerts "memory paper" affair smeesters affair debunking scientism statisticism mayo d. birnbaum logical flaws statistical foundations p values frequentists statistics probability scientific method(s) inference uses of probability estimation point interval estimation central limit theorem gold standard estimation hypothesis testing power severity excel program inductive inference null hypothesis fallacies of acceptance and rejection forensic anthropology histomorphology accept/reject procedures mis-specification (m-s) testing duhem's problem spurious correlation probativism nancy reid normative epistemology fidicual peirce inductive logic n-p testing critical challenges-replies biased selection effects pre-registration overstate evidence against the null posterior probability sampling distribution intentions epidemiology industrialization of the scientific process bright lines paradigm shift data-driven science biased data spanos replication technical activism nhst nonsignificant results liklihood principle connecting statistical claims to causal claims fraudbusting scientific methodology role for philosophers exploratory research reliablity false positivies bayes statistical fallacies error correction fallacies of non-statistically significant results columbia power vs severity post-data workshop on probability and learning novel evidence economics 2 cultures misspecification testing weight of evidence ipcc margherita harris keynes mathematical abstractions invalid inference transparency math literally vs. researchers mixed models simulations right question complexity meta-analysis meta-science meta-research information-compression logic fanelli methods & theories use tools correctly don't ban tools best measures vary questionable research practice r.a. likelihood priniciple nature of probability multiplicity legitimate data-dredging psa 22 texas sharpshooter fallacy confidence distributions credible intervals frequentists inference min-ge xie suzanne thornton psa 2022 clark glymour data dredging regression graphical model searches multiple hypothesis tests cmu james o. berger multiplicity in science gwas prior probabilities subgroup analyzes methodological probability bayesianism philosophy & practice neyman seminar jerzy neyman egon pearson problems of replication learning from error e. lehmann asa on p-values auditing model assumptions falsifiability ai ml artificial intelligence artificial intelliigence (ai) machine learning (ml) p-value controversy 2 way street stat & phil series of models statistical falsification neyman & pearson ij good larry laudan bonferroni qrps changing methodology local inference local methodology grp-qrps-fraud abolish qrps evidential pluralism stephan guttinger evidential plualism fallacious inferences associations jon williamson gatekeepers errors malfunction disclaimers foundations theoretical statistics statistical analysis statistical philosophy statistical testing in psych statistical testing research methods criticisms of p-values large sample size fallacies of non-rejection likelihoodists vs. significance testers frequentist feuds confidence intervals-problems new justification for ci fev/sev 5 sigman effect lindley o'hagan high energy particle physics severity interpretation of a rejection test t+ redefine statistical significance bayes/fisher disagreement relationship power & sample size j. berger and sellke do p-values exaggerate the evidence? casella and r. berger spike & smear bent: bad evidence-no test p-value vs posterior jeffreys type prior berger and delampady contrasting bayes factors sensitivity function howlers & chestnuts duality tests & conf. intervals 5 sigma effect statistical fluctuations fisher's testing principle p-value police sev principle for statistical significance look elsewhere effect (lee) key statistical conflicts fishing spurious p-values 21 word solution fisher-neyman & pearson dispute j. berger peace treaty diagnostic screening (ds) model of tests irreplication statistics battles ian hacking statistics debates erich lehmann role for probability in statistical inference e. s. pearson 3 steps in n-p tests pradeu editors role bias replicability & significance wasserstein wsl 2019 editorial socially aware data science r.a. fisher fiducial probability skeptical user statistis wars yoav benjamini selective inference relevant varability nejm guidelines daniel lakens replication studies c. hennig mathematical modeling models statistical tests' neyman pearson egon probativeness water plant example matching likelihood ratio generous to alternative default posterior probability david hand reproducibility crisis trust trustworthiness preregistration novelty bem meehl de groot registered reports covid-19 lab origins latham & wilson xu severe tests eclipse tests gtr bibliometrics lemoine philosophy in science (pins) psa 2021
See more