Home
Search results “Data mining by ian h witten eibe frank mark a hall”
Data mining
 
47:46
Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Aside from the raw analysis step, it involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating. The term is a misnomer, because the goal is the extraction of patterns and knowledge from large amount of data, not the extraction of data itself. It also is a buzzword, and is frequently also applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support system, including artificial intelligence, machine learning, and business intelligence. The popular book "Data mining: Practical machine learning tools and techniques with Java" (which covers mostly machine learning material) was originally to be named just "Practical machine learning", and the term "data mining" was only added for marketing reasons. Often the more general terms "(large scale) data analysis", or "analytics" -- or when referring to actual methods, artificial intelligence and machine learning -- are more appropriate. This video is targeted to blind users. Attribution: Article text available under CC-BY-SA Creative Commons image source in video
Views: 1685 Audiopedia
Mutual information
 
24:40
In probability theory and information theory, the mutual information or (formerly) transinformation of two random variables is a measure of the variables' mutual dependence. The most common unit of measurement of mutual information is the bit. This video is targeted to blind users. Attribution: Article text available under CC-BY-SA Creative Commons image source in video
Views: 4797 Audiopedia