Updating Computer
Science Education
Jacques Cohen
Brandeis University
Waltham, MA
USA
January 2007
Topics
Preliminary remarks
Present state of affairs and
concerns
Objectives of this talk
Trends (hardware, software, networks,
others)
Illustrative examples
Suggestions
Present state of affairs and
concerns
Huge increase in PC and internet usage.
Decreasing enrollment.
(USA mainly)
Possible Reasons
Previous high school preparation
Bubble burst (2000) + outsourcing
Widespread usage of computers
by lay persons
Interest in interdisciplinary
topics (e.g., biology, business,
economics)
Public perception about:
What is Computer Science?
The Nature of Computer
Science
Two main components:
Theoretical
and
Experimental
Mathematics and
Engineering
What characterizes CS is the notion of
Algorithms
Emphasis on the discrete and logic
An interdisciplinary approach with other sciences may well revive interest in the continuous (or the use of qualitative reasoning)
Related fields
Sciences in general (scientific
computing),
Management,
Psychology (human interaction),
Business,
Communications,
Journalism,
Arts, etc.
The role of Computer
Science among other
sciences
(How we are perceived by the other sciences)
In physics, chemistry, biology, nature
is the ultimate umpire.
Discovery is paramount
In math and engineering: aesthetics, ease of use, acceptance, and permanence play key roles
Uneasy dialogue with
biologists
It is not unusual to hear from a
physicist, chemist or biologist:
“If computer scientists do not get
involved in our field, we will do it
ourselves!!”
It looks very likely that the
biological sciences (including, of
course, neuroscience) will
dominate the 21st century
Differences in approaches
Most scientific and creative discoveries
proceed in a bottom-up manner
Computer scientists are taught to
emphasize top-down approaches
Polya’s “How to Solve It” often advises: first specialize, then generalize.
Hacking is beautiful (mostly bottom-up)
Objectives
Provide a bird’s eye view of what is
happening in CS education (USA) and
attempt to make recommendations
about possible directions. Hopefully,
some of it would be applicable to
European universities.
Premise
Changes ought to be gradual and
depend on resources and time
constraints
First we have to observe current
trends
Generality, Storage, Speed, Networks, others.
Trying to make sense of present
directions.
Difficult and risky to foresee the future,
e.g., PC (windows, mouse), internet,
parallelism
Topics influencing computer
science education.
Trends in hardware, software,
networks.
Huge volume of data
(terabytes and petabytes)
Statistical nature of data
Clustering, classification
Probability and Statistics
become increasingly
important
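To make "clustering" concrete, here is a minimal sketch (not from the talk) of 1-D k-means on made-up data; real pipelines use richer features and library implementations, but the assign-then-recenter loop is the same idea:

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Minimal 1-D k-means: repeatedly assign each point to its nearest
    center, then move every center to the mean of its cluster."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: abs(p - centers[c]))
            clusters[nearest].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]  # two obvious groups (toy data)
print(kmeans_1d(data, 2))  # centers settle near 1.0 and 10.0
```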
Trend towards generality
Need to know more about what is
going on in related topics
A few examples:
Robotics and mechanical engineering
Hardware, electrical engineering,
material science, nanotechnology
Multi-field visualization (e.g., medicine)
Biophysics and bioinformatics
Nature of data structures
Sequences (strings), streams
Trees, DAGs, and Graphs
3D structures
Emphasis on discrete structures
Neglect of the continuous should be corrected (e.g., use of MATLAB)
Trends in data growth
How Much Information Is There In the
World?
The 20-terabyte size of the Library of Congress is derived by assuming
that LC has 20 million books and
each requires 1 MB. Of course, LC
has much other stuff besides printed
text, and this other stuff would take
much more space.
From Lesk
http://www.lesk.com/mlesk/ksg97/ksg.html
Library of Congress data
(cont)
1. Thirteen million photographs, even if
compressed to a 1 MB JPG each, would be 13
terabytes.
2. The 4 million maps in the Geography Division
might scan to 200 TB.
3. LC has over five hundred thousand movies; at
1 GB each they would be 500 terabytes (most
are not full-length color features).
4. Bulkiest might be the 3.5 million sound
recordings, which at one audio CD each, would
be almost 2,000 TB.
This makes the total size of the Library perhaps
about 3 petabytes (3,000 terabytes).
How Much Information Is There In the
World?
Lesk’s Conclusions
There will be enough disk space
and tape storage in the world to
store everything people write,
say, perform or photograph. For
writing this is true already; for the
others it is only a year or two
away.
Lesk’s Conclusions (cont)
The challenge for librarians and
computer scientists is to let us find
the information we want in other
people's work; and the challenge for
the lawyers and economists is to
arrange the payment structures so
that we are encouraged to use the
work of others rather than re-create it.
The huge volume of data
implies:
Linearity of algorithms is a must
Emphasis on pattern matching
Increased preprocessing
Different levels of memory transfer rates
Algorithmic incrementality (avoid redoing tasks)
Need for approximate algorithms (optimization)
Distributed computing
Centralized parallelism (Blue Gene, Argonne)
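The incrementality point can be made concrete with a sketch (not from the talk) of Welford's one-pass algorithm, which updates summary statistics per item instead of re-scanning the accumulated data:

```python
def running_stats(stream):
    """Welford's one-pass algorithm: each new value updates the mean and
    variance in O(1) time and memory, with no need to re-scan old data."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, (m2 / (n - 1) if n > 1 else 0.0)

print(running_stats([1, 2, 3, 4]))  # mean 2.5, sample variance 5/3
```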
The importance of pattern matching (searches) in large numbers of items
Pattern matching has to be “tolerant” (approximate)
Find closest matches (dynamic programming,
optimization)
Sequences
Pictures
3D structures (e.g. proteins)
Sound
Photos
Video
Trends in computer cycles
(speed)
Moore’s law appears to be applicable until at
least 2020
Use of supercomputers
(2006)
Researchers at Los Alamos National Laboratory have set a new world record by performing the first million-atom computer simulation in biology. Using the "Q Machine" supercomputer, Los Alamos computer scientists have created a molecular simulation of the cell's protein-making structure, the ribosome. The project, simulating 2.64 million atoms in motion, is more than six times larger than any biological simulation performed to date.
Graphical visualization of the
simulation of a Ribosome at
work
Network transmission
speed (Lambda Rail Net)
USA backbone
Trends in Transmission Speed
The High Energy Physics
team's demonstration
achieved a peak throughput of 151 Gbps and an official mark of 131.6 Gbps, beating their previous peak-throughput mark of 101 Gbps by 50 percent.
Trends in Transmission
Speed II
The new record data transfer
speed is also equivalent to
serving 10,000 MPEG2 HDTV
movies simultaneously in real
time, or transmitting all of
the printed content of the
Library of Congress in 10
minutes.
Trend in Languages
Importance of scripting and
string processing
XML, Java, C++; trend towards Python, MATLAB, Mathematica
No ideal languages
No agreement on what the first language ought to be
A recently proposed
language (Fortress 2006)
From Guy Steele, The Fortress Programming Language, Sun Microsystems
http://iic.harvard.edu/documents/steeleLecture2006public.pdf
Fortress Language
(Sun, Guy Steele)
Meta-level approach to
teaching
Learn 2 or 3 languages and assume that
expertise in other languages can be
acquired on the fly.
Hopefully, the same will occur in learning a
topic in depth. Once in-depth research is
taught in a particular area, it can be
extrapolated to other areas.
Increasing usage of canned programs or data banks. Typical examples: GraphViz, WordNet
Trends in Algorithmic
Complexity
Overcoming the scare of NP
problems
(it happened before with
undecidability)
3-SAT lessons
Mapping polynomial problems
within NP
Optimization, approximate or
random algorithms
Three Examples
Example I The lessons of BLAST
(preprocessing, incrementability,
approximation)
Example II The importance of analyzing
very large networks.
(probability, sensors, sociological implications)
Example III Time Series.
(data mining, pattern searches, classification)
Example I
(History of BLAST)
sequence alignment
Biologists matched sequences of
nucleotides or amino acids empirically using dot matrices
Dot matrices
No exact matching
Alignment with Gaps
Dynamic Programming
Approach
Dynamic Programming
complexity O(n²)
Two solutions with gaps
Complexity can be exponential
for determining all solutions
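The quadratic dynamic-programming approach can be sketched as a Needleman-Wunsch-style score computation (an illustration, not the talk's code; the scoring values are arbitrary choices):

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch-style global alignment score in O(len(a)*len(b))
    time and O(len(b)) space (only two rows of the DP table are kept)."""
    n, m = len(a), len(b)
    prev = [j * gap for j in range(m + 1)]  # aligning a prefix against gaps
    for i in range(1, n + 1):
        cur = [i * gap] + [0] * m
        for j in range(1, m + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur[j] = max(diag, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[m]

print(nw_score("GATTACA", "GCATGCU"))
```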
The BLAST approach
complexity is almost
linear
Equivalent dot matrices would have 3 billion columns (human genome) and Z rows, where Z is the length of the sequence being matched against a genome (possibly tens of thousands)
BLAST Tricks
Preprocessing
Compile the locations in a genome
containing all possible “seeds”
(combinations of 6 nucleotides or
amino acids)
Hacking
Follow diagonals as much as possible
(BLAST strategy)
Use dynamic programming as a last
resort
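A minimal sketch of the seed-index trick (hypothetical toy genome, seed length 4 rather than BLAST's usual word sizes):

```python
from collections import defaultdict

def build_seed_index(genome, k):
    """Preprocessing: map every length-k word ('seed') to the positions
    where it occurs in the genome, so each later lookup is O(1)."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index

def seed_hits(query, index, k):
    """Exact seed matches as (query_pos, genome_pos) pairs; BLAST then
    extends such hits along diagonals, using DP only as a last resort."""
    hits = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], ()):
            hits.append((j, i))
    return hits

genome = "ACGTACGTTAGCCGATACGT"   # made-up toy genome
idx = build_seed_index(genome, k=4)
hits = seed_hits("TAGCCGAT", idx, k=4)
print(hits)  # all hits lie on one diagonal: genome_pos - query_pos == 8
```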
Lots of approximations but a
very successful outcome
No multiple solutions
BLAST may not find the best matches
The notion of p-values becomes very
important (probability of matches in
random sequences)
Tuning of the BLAST algorithm
parameters
Mixture of hacking and theory
Advantage: satisfies incrementability
Example II
(Networks and Sociology)
Money travels (bills)
Probabilities
P(time,distance)
Money travels
The entire process could be
implemented using sensors.
Mimics spread of disease.
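One way to make the bill-tracking idea tangible is a toy simulation (not the actual study's data or dispersal law): model each bill as a sequence of random hops and collect the empirical distance distribution as a function of time:

```python
import math
import random

def simulate_bills(n_bills=500, steps=30, seed=1):
    """Toy model of travelling bills: each bill makes one random planar
    hop per time step; the returned list is an empirical sample of
    P(distance | time = steps)."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_bills):
        x = y = 0.0
        for _ in range(steps):
            angle = rng.uniform(0.0, 2.0 * math.pi)
            hop = rng.expovariate(1.0)  # random hop length, mean 1
            x += hop * math.cos(angle)
            y += hop * math.sin(angle)
        distances.append(math.hypot(x, y))
    return distances

d = simulate_bills()
print(sum(d) / len(d))  # mean distance travelled after `steps` hops
```

The same skeleton works for disease spread: replace "bill" by "infected carrier" and record contacts instead of distances.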
The impact of computing will
go deeper into the sciences
and spread more into the
social sciences (Jon Kleinberg,
2006)
Example III (Time Series)
Illustrates data mining and
how much CS can help other
sciences
Slides from
Dr Eamonn Keogh
University of California, Riverside, CA
Examples of time
series
Time Series (cont 1)
Time Series (cont 2)
Time Series (cont 3)
Time Series (cont 4)
Time Series (cont 5)
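The primitive underlying these examples can be sketched as a brute-force nearest-subsequence search (illustrative code, not Keogh's; production methods add normalization and pruning):

```python
import math

def best_match(series, query):
    """Slide `query` across `series` and return (offset, Euclidean
    distance) of the closest window: the basic primitive behind motif
    discovery, classification, and anomaly detection in time series."""
    m = len(query)
    best_off, best_d = None, float("inf")
    for i in range(len(series) - m + 1):
        d = math.sqrt(sum((series[i + k] - query[k]) ** 2 for k in range(m)))
        if d < best_d:
            best_off, best_d = i, d
    return best_off, best_d

series = [0, 0, 1, 3, 1, 0, 0, 2, 6, 2, 0]
print(best_match(series, [1, 3, 1]))  # exact copy found at offset 2
```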
Using Logic Programming in
Multivariate Time Series (Sleep
Apnea)
from G Guimarães and L. Moniz Pereira
[Figure: multivariate time-series chart (04:00:00 to 04:04:00) of airflow, ribcage movements, abdominal movements, and snoring, annotated with detected events such as "strong airflow with snoring", "no airflow without snoring", "reduced ribcage and abdominal movements without snoring", and tacet phases.]
Back to curricula
recommendations
Present status (USA) and suggested changes
Current recommended
curricula ACM, SIGCSE 2001 (USA)
1. Discrete Structures (43 core hours)
2. Programming Fundamentals (54 core hours)
3. Algorithms and Complexity (31 core hours)
4. Programming Languages (6 core hours)
5. Architecture and Organization (36 core hours)
6. Operating Systems (18 core hours)
7. Net-Centric Computing (15 core hours)
8. Human-Computer Interaction (6 core hours)
9. Graphics and Visual Computing (5 core hours)
10. Intelligent Systems (10 core hours)
11. Information Management (10 core hours)
12. Software Engineering (30 core hours)
13. Social and Professional Issues (16 core hours)
14. Computational Science (no core hours)
From Domik G.: Glimpses into the Future of Computer Science
Education, University of Paderborn, Germany
Changing Curricula
Two extremes
Increased generality and limited depth
Limited generality and increased depth
The two extremes in graphical
form
[Figure: breadth (generality) versus depth trade-off.]
The MIT pilot program for
freshmen
At MIT there is a unified EECS
department
Two choices for the first-year course:
Robotics using probabilistic
Bayesian approaches (CS)
Study of cell phones inside out (EE)
Concrete suggestions I
Teaching is inextricably linked to research.
Time and resources govern curriculum
changes.
Gradual changes are essential.
Avoid overlap of material among different
required courses.
If possible, introduce an elective course on "Current trends in computer science."
Deal with massive data even in intro courses.
Concrete suggestions II
When teaching algorithms stress
the potential of:
Preprocessing
Incrementality
Parallelization
Approximations
Taking advantage of sparseness
Concrete suggestions III
Emphasize probability and
statistics
Bayesian approaches
Hidden Markov Models
Random algorithms
Clustering and classification
Machine learning and Data
Mining
Finally, …
Encourage
interdisciplinary work.
It will inspire new directions
in computer science.
Thank you!!
Future of Computer Intensive
Science in the U.S. (Daniel Reed 2006)
Ten years – a geological epoch on the computing time scale.
Looking back, a decade brought the web and consumer email,
digital cameras and music, broadband networking, multifunction
cell phones, WiFi, HDTV, telematics, multiplayer games,
electronic commerce and computational science.
It also brought spam, phishing, identity theft, software insecurity,
outsourcing and globalization, information warfare and blurred
work-life boundaries. What will a decade of technology advances
bring in communications and collaboration, sensors and
knowledge management, modeling and discovery, electronic
commerce and digital entertainment, critical infrastructure
management and security?
What will it mean for research and education?
Daniel A. Reed is the director of the Renaissance Computing Institute. He also is Chancellor's
Eminent Professor and Vice-Chancellor for Information Technology at the University of North
Carolina at Chapel Hill.
Cyberinfrastructure and Economic
Curvature Creating Curvature in a
Flat World (Sangtae Kim, Purdue, 2006)
Cyberinfrastructure is central to scientific advancement in
the modern, data-intensive research environment. For
example, the recent revolution in the life sciences, including
the seminal achievement of sequencing the human genome
on an accelerated time frame, was made possible by parallel
advances in cyberinfrastructure for research in this data-intensive field.
But beyond the enablement of basic research,
cyberinfrastructure is a driver for global economic growth
despite the disruptive 'flattening' effect of IT in the
developed economies. But even at the regional level,
visionary cyber investments to create smart infrastructures
will induce 'economic curvature', a gravitational pull to
overcome the dispersive effects of the 'flat' world and the
consequential acceleration in economic growth.
Miscellaneous I
Claytronics
Game theory (economics - psychology)
Other examples in bioinformatics
Beautiful interaction between sequence
(strings) and structures
Reverse engineering
In biology, geography and phenotype
(external structural appearance) are of
paramount importance
Systems Biology
Miscellaneous II
Crossword puzzles using Google
Skiena and statistical NLP