IBM InfoSphere Advanced QualityStage V11.5 SPVC – 2M413GSPVC
Enrol Details
Course Code: 2M413G
Brand: Cloud Data Services
Category: Analytics
Skill Level: Advanced
Duration: 24H
Modality: SPVC
Audience
The intended audience for this course is:
• QualityStage programmers
• Data Analysts responsible for data quality using QualityStage
• Data Quality Architects
• Data Cleansing Developers
• Data Quality Developers needing to customize QualityStage rule sets
Prerequisites
Participants should have:
• Completed the QualityStage Essentials course, or have equivalent experience
• Familiarity with Windows and a text editor
• Familiarity with elementary statistics and probability concepts (desirable but not essential)
Short Summary
Learn about the IBM InfoSphere Advanced QualityStage V11.5 data cleansing process.
Overview
Contains: PDF course guide, as well as a lab environment where students can work through demonstrations and exercises at their own pace.
This course steps you through the QualityStage data cleansing process. You will transform an unstructured data source into a format suitable for loading into an existing data target. You will cleanse the source data by building a custom rule set and using that rule set to standardize the data. Next, you will build a reference match to relate the cleansed source data to the existing target data.
If you are enrolling in a Self-Paced Virtual Classroom or Web-Based Training course, before you enroll, please review the Self-Paced Virtual Classes and Web-Based Training Classes section of our Terms and Conditions page, as well as the system requirements, to ensure that your system meets the minimum requirements for this course: http://www.ibm.com/training/terms
Topic
After completing this course, you should be able to:
• Modify rule sets
• Build custom rule sets
• Standardize data using the custom rule set
• Perform a reference match using standardized data and a reference data set
• Use advanced techniques to refine a two-source match
Objectives
1: QualityStage Review
• Course project
• QualityStage review
• Data Quality
• Master Data Management
• Investigate
• Standardize
• Match
2: Structure of a Rule Set
• Rule Sets and Rule Set files
• Classes and Classification tables
• Thresholds
• Dictionary files
• Pattern action files
• Optional tables
3: Creation of a Custom Rule Set
• Custom Rule Set development cycle
• Investigate data file
• Parsing
• SEPLIST/STRIPLIST updates
4: Initial Investigation of Data to Be Standardized
• Word Investigation
• Pattern report
• Token report
5: Classification Table
• Create the Classification Table
• Classification schema
• What to classify
• Process
• Resulting Classification File with Legend
• Pattern review: refining the Classification Table
6: Pattern Action File
• Pattern Action Language
• Development of Pattern Action Sets
• Refining Pattern Action Sets
• Investigation of Standardized Results
7: Standardization Rules Designer
• What is the Standardization Rules Designer (SRD)?
• Using the SRD
• SRD work areas
• Rule Set revision and selection
• Embedded assistance
8: Match Frequency
• Match frequency job
• Column mapping
• Match frequency data set
• Using match frequencies in a match job
9: Two-Source (Reference Match) Advanced Implementation
• Create a reference match between standardized product data and warehouse data
• Refine the match results using the description fields of the standardized product data and the warehouse data