Data Cleaning

Data cleaning (also called data cleansing or data scrubbing) comprises various methods for removing and correcting data errors in databases or other information systems. Such errors may consist, for example, of incorrect (wrong from the outset or outdated), redundant, inconsistent, or incorrectly formatted data.

Key steps in data cleaning are duplicate detection (identifying and merging identical records) and data fusion (combining and completing records that have gaps).
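The two steps can be illustrated with a minimal Python sketch. It is an illustration only, not a method prescribed by this entry: the record layout (dictionaries with a `name` field), the matching rule (names compared after lowercasing and collapsing whitespace), and the helper names `normalize_key`, `find_duplicates`, `fuse`, and `clean_records` are all assumptions chosen for the example. Real duplicate detection usually relies on fuzzier similarity measures.

```python
from collections import defaultdict


def normalize_key(record):
    """Build a comparison key: lowercase the name and collapse whitespace.

    Assumption: two records describe the same entity if their names match
    after this normalization.
    """
    return " ".join(record["name"].lower().split())


def find_duplicates(records):
    """Duplicate detection: group records that share the same normalized key."""
    groups = defaultdict(list)
    for record in records:
        groups[normalize_key(record)].append(record)
    return list(groups.values())


def fuse(group):
    """Data fusion: merge a group into one record, filling gaps.

    For each field, the first non-empty value found in the group wins.
    Fields that are empty in every record are kept with the value None.
    """
    merged = {}
    for record in group:
        for field, value in record.items():
            if value not in (None, "") and merged.get(field) in (None, ""):
                merged[field] = value
    for record in group:
        for field in record:
            merged.setdefault(field, None)
    return merged


def clean_records(records):
    """Run duplicate detection, then fuse each group into a single record."""
    return [fuse(group) for group in find_duplicates(records)]
```

Called on two records for "Anna Schmidt" and " anna  schmidt ", where one carries only the city and the other only the e-mail address, `clean_records` returns a single record containing both values. The "first non-empty value wins" rule is the simplest possible conflict strategy; it does not resolve contradictory values.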

Data cleaning contributes to improving information quality. However, information quality also covers many other properties of data sources (credibility, relevance, availability, cost, ...) that data cleaning cannot improve.

Wikimedia Foundation.

See also the following entries in other dictionaries:

  • Data profiling — is the process of examining the data available in an existing data source (e.g. a database or a file) and collecting statistics and information about that data. The purpose of these statistics may be to: Find out whether existing data can easily… …   Wikipedia

  • Data migration — is the process of transferring data between storage types, formats, or computer systems. Data migration is usually performed programmatically to achieve an automated migration, freeing up human resources from tedious tasks. It is required when… …   Wikipedia

  • Data analysis — Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches,… …   Wikipedia

  • Data cleansing — Not to be confused with Sanitization (classified information). Data cleansing, data cleaning, or data scrubbing is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used… …   Wikipedia

  • Data mining — Not to be confused with analytics, information extraction, or data analysis. Data mining (the analysis step of the knowledge discovery in databases process,[1] or KDD), a relatively young and interdisciplinary field of computer science[2][3] is… …   Wikipedia

  • Data processing — otheruses|Data entry clerk Data processing is any computer process that converts data into information or knowledge. [i.e. data processing can be any computer operation or series of operations performed on data to get insightful information.] The …   Wikipedia

  • Data clarification form — A Data Clarification Form (DCF) or Data Query Form (DQF) is a questionnaire specifically used in clinical research. The DCF is the primary data clarification tool from the trial sponsor or Contract Research Organization (CRO) towards the… …   Wikipedia

  • data cleansing — / deɪtə ˌklenzɪŋ/, data cleaning / deɪtə ˌkli:nɪŋ/ noun checking data to make sure it is correct …   Marketing dictionary in english

  • Data warehouse — Overview In computing, a data warehouse (DW) is a database used for reporting and analysis. The data stored in the warehouse is uploaded from the operational systems. The data may pass through an operational data store for additional operations… …   Wikipedia
