Data mining (Page 1)

The process of discovering patterns in large data sets using computational methods at the intersection of statistics, database systems, and machine learning.

Subconcepts:
  1. Alias
  2. Analytics
  3. Anomaly detection
  4. Association rule learning
  5. Automatic Data Processing
  6. Automatic Identification System
  7. Bibliometrics
  8. Big data
  9. Business intelligence
  10. Cardinality (data modeling)
  11. Classification rule
  12. Computer graphics
  13. Data analysis
  14. Database design
  15. Data classification
  16. Data correlation
  17. Data cube
  18. Data curation
  19. Data integration
  20. Data management
  21. Data pre-processing
  22. Data reduction
  23. Data reliability
  24. Data retrieval
  25. Data source
  26. Data stream mining
  27. Data system
  28. Data verification
  29. Data warehouse
  30. Decision support system
  31. Decision tree
  32. Dempster–Shafer theory
  33. Diagnostic model
  34. Differential privacy
  35. Dimensional modeling
  36. Exploratory data analysis
  37. First class
  38. Hierarchical database model
  39. Industry 4.0
  40. Infographic
  41. Information aggregation
  42. Information gain
  43. Information integration
  44. Intrusion detection system
  45. Knowledge extraction
  46. Learning analytics
  47. Lift (data mining)
  48. Longitudinal data
  49. Market intelligence
  50. Master data
  51. Measure (data warehouse)
  52. Model parameter
  53. Multidimensional data
  54. Nearest neighbor search
  55. Nested loop join
  56. Network model
  57. Null (SQL)
  58. Patent visualisation
  59. Problem solving environment
  60. Prognostics
  61. Query language
  62. Query optimization
  63. Reference data
  64. Relational database
  65. Relation (database)
  66. Rough set
  67. Rule induction
  68. Sequential Pattern Mining
  69. Similarity measure
  70. Skyline
  71. Small data
  72. Streaming data
  73. Table (database)
  74. Temporal database
  75. Text mining
  76. Time sequence
  77. Uncertain data
  78. Vector space model
  79. Very large database
  80. Visualization
  81. Visual reasoning
  82. Webometrics
Papers in this category: 1 835 571 Current Page: 1 / 100
DOI: 10.1145/1355734.1355746
2008
Cited 7604 times
OpenFlow
DOI: 10.1016/0022-2496(77)90033-5
1977
Cited 7598 times
A scaling method for priorities in hierarchical structures
DOI: 10.1287/isre.2.3.192
1991
Cited 7562 times
Development of an Instrument to Measure the Perceptions of Adopting an Information Technology Innovation
DOI: 10.1186/gb-2003-4-5-p3
2003
Cited 7545 times
DAVID: Database for Annotation, Visualization, and Integrated Discovery
DOI: 10.1126/science.3287615
1988
Cited 7537 times
Measuring the Accuracy of Diagnostic Systems
DOI: 10.1093/bioinformatics/19.2.185
2003
Cited 7536 times
A comparison of normalization methods for high density oligonucleotide array data based on variance and bias
DOI: 10.1186/1471-2105-11-119
2010
Cited 7490 times
Prodigal: prokaryotic gene recognition and translation initiation site identification
DOI: 10.1017/cbo9780511806384
2002
Cited 7359 times
Experimental Design and Data Analysis for Biologists
DOI: 10.1093/bioinformatics/btp348
2009
Cited 7321 times
trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
DOI: 10.2307/2531248
1986
Cited 7280 times
Longitudinal Data Analysis for Discrete and Continuous Outcomes
DOI: 10.1038/s41467-019-09234-6
2019
Cited 7202 times
Metascape provides a biologist-oriented resource for the analysis of systems-level datasets
DOI: 10.1186/1471-2105-14-7
2013
Cited 7176 times
GSVA: gene set variation analysis for microarray and RNA-Seq data
DOI: 10.1155/2011/156869
2011
Cited 7170 times
FieldTrip: Open Source Software for Advanced Analysis of MEG, EEG, and Invasive Electrophysiological Data
DOI: 10.1186/1471-2105-5-113
2004
Cited 7168 times
DOI: 10.1108/ebr-11-2018-0203
2019
Cited 7158 times
When to use and how to report the results of PLS-SEM
DOI: 10.1016/j.cels.2015.12.004
2015
Cited 7131 times
The Molecular Signatures Database Hallmark Gene Set Collection
DOI: 10.1111/j.2006.0906-7590.04596.x
2006
Cited 7090 times
Novel methods improve prediction of species’ distributions from occurrence data
DOI: 10.1177/1098214005283748
2006
Cited 7072 times
A General Inductive Approach for Analyzing Qualitative Evaluation Data
DOI: 10.1016/j.patrec.2009.09.011
2010
Cited 6900 times
Data clustering: 50 years beyond K-means
DOI: 10.1109/tcom.1980.1094577
1980
Cited 6889 times
An Algorithm for Vector Quantizer Design
DOI: 10.1016/0894-1777(88)90043-x
1988
Cited 6802 times
Describing the uncertainties in experimental results
DOI: 10.1108/eb046814
1980
Cited 6760 times
An algorithm for suffix stripping
DOI: 10.1109/cvpr.2017.634
2017
Cited 6735 times
Aggregated Residual Transformations for Deep Neural Networks
DOI: 10.1093/bib/bbs017
2012
Cited 6702 times
Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration
DOI: 10.1093/nar/gkw377
2016
Cited 6621 times
Enrichr: a comprehensive gene set enrichment analysis web server 2016 update
DOI: 10.1016/j.molp.2020.06.009
2020
Cited 6601 times
TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data
DOI: 10.1002/wics.101
2010
Cited 6508 times
Principal component analysis
DOI: 10.1145/361219.361220
1975
Cited 6473 times
A vector space model for automatic indexing
DOI: 10.1093/bioinformatics/bts565
2012
Cited 6451 times
CD-HIT: accelerated for clustering the next-generation sequencing data
DOI: 10.1006/jmbi.2000.4042
2000
Cited 6429 times
T-coffee: a novel method for fast and accurate multiple sequence alignment (edited by J. Thornton)
DOI: 10.1016/j.jneumeth.2007.03.024
2007
Cited 6401 times
Nonparametric statistical testing of EEG- and MEG-data
DOI: 10.1109/21.87068
1988
Cited 6377 times
On ordered weighted averaging aggregation operators in multicriteria decisionmaking
DOI: 10.1016/j.neuroimage.2011.10.018
2012
Cited 6322 times
Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion
DOI: 10.1007/bf02289565
1964
Cited 6321 times
Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis
DOI: 10.1093/bioinformatics/btt086
2013
Cited 6305 times
QUAST: quality assessment tool for genome assemblies
DOI: 10.1016/c2009-0-19715-5
2011
Cited 6264 times
Data Mining: Practical Machine Learning Tools and Techniques
DOI: 10.1107/s002188980600731x
2006
Cited 6231 times
Mercury: visualization and analysis of crystal structures
DOI: 10.1002/sim.4067
2010
Cited 6225 times
Multiple imputation using chained equations: Issues and guidance for practice
DOI: 10.7717/peerj.2584
2016
Cited 6205 times
VSEARCH: a versatile open source tool for metagenomics
DOI: 10.1016/0034-4257(91)90048-b
1991
Cited 6168 times
A review of assessing the accuracy of classifications of remotely sensed data
DOI: 10.1017/s0033822200033865
2009
Cited 6164 times
Bayesian Analysis of Radiocarbon Dates
DOI: 10.1109/tpami.1979.4766909
1979
Cited 6144 times
A Cluster Separation Measure
DOI: 10.1109/tkde.2008.239
2009
Cited 6139 times
Learning from Imbalanced Data
DOI: 10.1177/108705719900400206
1999
Cited 6123 times
A Simple Statistical Parameter for Use in Evaluation and Validation of High Throughput Screening Assays
DOI: 10.1177/117693430500100003
2005
Cited 6090 times
Arlequin (version 3.0): An integrated software package for population genetics data analysis
DOI: 10.1016/s1361-8415(01)00036-6
2001
Cited 6040 times
A global optimisation method for robust affine registration of brain images
DOI: 10.1093/biostatistics/kxj037
2006
Cited 5996 times
Adjusting batch effects in microarray expression data using empirical Bayes methods
DOI: 10.1016/s0092-6566(03)00046-1
2003
Cited 5992 times
A very brief measure of the Big-Five personality domains
DOI: 10.1101/gr.8.3.175
1998
Cited 5979 times
Base-Calling of Automated Sequencer Traces Using Phred. I. Accuracy Assessment
DOI: 10.1017/cbo9780511811241
2005
Cited 5942 times
Microeconometrics
DOI: 10.1023/a:1014573219977
2002
Cited 5937 times
DOI: 10.1186/1758-2946-4-17
2012
Cited 5880 times
Avogadro: an advanced semantic chemical editor, visualization, and analysis platform
DOI: 10.1093/clinchem/39.4.561
1993
Cited 5865 times
Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine
DOI: 10.1093/bioinformatics/btn129
2008
Cited 5824 times
adegenet: a R package for the multivariate analysis of genetic markers
MAG: 2125055259
1992
Cited 5798 times
C4.5: Programs for Machine Learning
DOI: 10.1093/bioinformatics/btg359
2003
Cited 5794 times
DnaSP, DNA polymorphism analyses by the coalescent and other methods
DOI: 10.1186/gb-2010-11-3-r25
2010
Cited 5774 times
A scaling normalization method for differential expression analysis of RNA-seq data
DOI: 10.1126/science.1136800
2007
Cited 5681 times
Clustering by Passing Messages Between Data Points
DOI: 10.1111/j.2041-210x.2009.00001.x
2009
Cited 5670 times
A protocol for data exploration to avoid common statistical problems
MAG: 2187089797
2008
Cited 5658 times
Visualizing Data using t-SNE
DOI: 10.1103/physreve.70.066111
2004
Cited 5653 times
Finding community structure in very large networks
DOI: 10.2307/2289883
1989
Cited 5634 times
Statistical Analysis With Missing Data
DOI: 10.1007/bf01908075
1985
Cited 5595 times
Comparing partitions
DOI: 10.1093/bioinformatics/btm308
2007
Cited 5591 times
TASSEL: software for association mapping of complex traits in diverse samples
DOI: 10.1109/72.991427
2002
Cited 5562 times
A comparison of methods for multiclass support vector machines
DOI: 10.1038/nmeth.3901
2016
Cited 5530 times
The Perseus computational platform for comprehensive analysis of (prote)omics data
DOI: 10.1038/nprot.2010.5
2010
Cited 5498 times
I-TASSER: a unified platform for automated protein structure and function prediction
DOI: 10.1186/s40537-019-0197-0
2019
Cited 5451 times
A survey on Image Data Augmentation for Deep Learning
DOI: 10.1021/j100377a021
1990
Cited 5439 times
Reaction path following in mass-weighted internal coordinates
DOI: 10.1177/002224377901600110
1979
Cited 5437 times
A Paradigm for Developing Better Measures of Marketing Constructs
DOI: 10.1080/01621459.1988.10478722
1988
Cited 5435 times
A Test of Missing Completely at Random for Multivariate Data with Missing Values
DOI: 10.18637/jss.v045.i03
2011
Cited 5420 times
mice: Multivariate Imputation by Chained Equations in R
DOI: 10.1021/cr00005a013
1991
Cited 5419 times
A quantum theory of molecular structure and its applications
DOI: 10.1093/nar/gkh293
2004
Cited 5388 times
ARB: a software environment for sequence data
DOI: 10.1093/bioinformatics/btw313
2016
Cited 5372 times
Complex heatmaps reveal patterns and correlations in multidimensional genomic data
DOI: 10.1002/sim.2929
2007
Cited 5328 times
Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond
DOI: 10.18637/jss.v025.i01
2008
Cited 5308 times
FactoMineR: An R Package for Multivariate Analysis
DOI: 10.1093/bioinformatics/btm233
2007
Cited 5275 times
CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure
DOI: 10.1109/34.709601
1998
Cited 5251 times
The random subspace method for constructing decision forests
DOI: 10.1017/cbo9780511790942
2006
Cited 5212 times
Data Analysis Using Regression and Multilevel/Hierarchical Models
DOI: 10.1016/s0001-2998(78)80014-2
1978
Cited 5161 times
Basic principles of ROC analysis
DOI: 10.1006/jmbi.1999.3091
1999
Cited 5143 times
Protein secondary structure prediction based on position-specific scoring matrices (edited by G. Von Heijne)
DOI: 10.1109/tse.1976.233837
1976
Cited 5132 times
A Complexity Measure
DOI: 10.1148/radiol.2015151169
2016
Cited 5119 times
Radiomics: Images Are More than Pictures, They Are Data
DOI: 10.1038/nmeth.4169
2017
Cited 5097 times
cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination
DOI: 10.1021/cr00031a013
1994
Cited 5079 times
Molecular Interactions in Solution: An Overview of Methods Based on Continuous Distributions of the Solvent
DOI: 10.1007/s11263-013-0620-5
2013
Cited 5078 times
Selective Search for Object Recognition
DOI: 10.1016/b978-0-08-032599-6.50008-8
1985
Cited 5071 times
The Analytic Hierarchy Process
DOI: 10.21314/jor.2000.038
2000
Cited 5065 times
Optimization of conditional value-at-risk
DOI: 10.1093/bioinformatics/btr509
2011
Cited 5055 times
A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data
DOI: 10.1609/icwsm.v3i1.13937
2009
Cited 5050 times
Gephi: An Open Source Software for Exploring and Manipulating Networks
DOI: 10.1109/tnn.2005.845141
2005
Cited 5050 times
Survey of Clustering Algorithms
DOI: 10.1111/nhs.12048
2013
Cited 5049 times
Content analysis and thematic analysis: Implications for conducting a qualitative descriptive study
DOI: 10.1214/ss/1177012413
1989
Cited 5043 times
Design and Analysis of Computer Experiments
DOI: 10.1017/cbo9780511809682
2004
Cited 5009 times
Kernel Methods for Pattern Analysis
DOI: 10.1214/aoms/1177698950
1967
Cited 4992 times
Upper and Lower Probabilities Induced by a Multivalued Mapping
DOI: 10.1016/0098-3004(84)90020-7
1984
Cited 4959 times
FCM: The fuzzy c-means clustering algorithm
DOI: 10.1371/journal.pone.0021800
2011
Cited 4956 times
REVIGO Summarizes and Visualizes Long Lists of Gene Ontology Terms
DOI: 10.1109/34.667881
1998
Cited 4953 times
On combining classifiers
DOI: 10.1093/nar/gkw257
2016
Cited 4950 times
deepTools2: a next generation web server for deep-sequencing data analysis