CZ4125
Developing Data Products
Data products are products whose principal purpose is to use data to achieve specific end goals. They vary in complexity, ranging from raw and derived data to intelligence and services derived from or driven by data and enabled by algorithmic and statistical tools such as machine learning. In many data products the role of data is subtle, and the overall product may be composed of other modules that are more prominent in appearance.
This course is confined to the life cycle of typical data products that rely predominantly or exclusively on data science. We will deconstruct the individual components across the full stack: data acquisition, wrangling and storage; testing, validation and refinement; exploratory analysis; visualization and presentation; and the application of machine learning techniques for decision support, anomaly detection and recommendation systems. Hands-on examples will illustrate how some data products use only a subset of these components, while others deploy the wider gamut.
This course will also cover techniques for handling various kinds of data, individually and in combination (e.g., natural language data, datetime and time series data, geoseries and graph data), together with the underlying systems and algorithms used to support, analyze and learn from such data when building data products.
Overall, in this course, we will describe several fundamental principles, and we will illustrate them in action using hands-on examples and tools. The choice of algorithms and software packages will be representative but not exhaustive. As such, the detailed description of course content uses 'e.g.,' to give possible indicative instances, but the actual delivery of the course may choose other tools, programming languages, frameworks and examples (including or excluding the indicative instances). Furthermore, the instantiation of tools and examples could vary over time, based on practical considerations, e.g., maturity and popularity of the tools among practitioners.
The course will be delivered through a mix of live lectures and hands-on exercises, blended with some lecture segments offered in pre-recorded TEL format and curated reading materials.
| Attribute | Value |
|---|---|
| AUs | 3.0 AUs |
| Exam | N/A |
| Grade Type | N/A |
| Maintaining Dept | N/A |
| Prerequisites | , MH2500 |
| Mutually Exclusive With | N/A |
| Not Available To Programme | N/A |
| Not Available To All Programme With | (Admyr 2021-onwards) |
| Not Available as Core for programmes | N/A |
| Not Available as PE for programmes | N/A |
| Not Available as BDE/UEs for programmes | N/A |
| Not Offered To | N/A |
Total hours per week: 3 hrs
Available Indexes
| Day | Time | Class |
|---|---|---|
| Wed | 1430-1620 | COMMON LEC (SCL4), TR+95 |
| Wed | 1630-1720 | 10459 TUT (SCEL), TR+95 |
Other Relevant Mods
CZ1016: Introduction To Data Science
CZ1103: Introduction To Computational Thinking & Programming
CZ1104: Linear Algebra For Computing
CZ1105: Digital Logic
CZ1106: Computer Organisation & Architecture
CZ2001: Algorithms
CZ2002: Object Oriented Design & Programming
CZ2003: Computer Graphics & Visualisation
CZ2004: Human Computer Interaction