Exam 3 Topics

Image classification, NLP, time series, ethics How would you use this side Computer vision & everything past

Graphs

Protein Folding Problem

Molecular structure Look at a protein and figure out how it interacts with other proteins

Giant social graph

Computer Networks

Network graphs Find unusual traffic Where to shorten things

Graph Problems

Propose new connections
Find shortest route
Organize compute graphs
Analyze graph health
Predict new links
Locate communities
- Get a famous person/the most popular person in that community to promote it in the community
Find bots

Graph Level Tasks

Describe properties of a graph
Predict how graph will behave
Find clusters/structure within the graph
Describe change over time In a graph-level task, our goal is to predict the property of an entire graph - for example, for a molecule represented as a graph, we might want to predict what the molecule smells like

Node Level Tasks

Classify nodes based on community A classic example of a node-level prediction problem is Zach’s karate club - single social network graph made up of individuals who have sworn allegiance to one of two karate instructors, a feud between Mr. Hi and John H creates a schism in the karate club, prediction problem is to classify whether a given member becomes loyal to either Mr. Hi or John H after the feud

Edge Level Tasks

Predict when an edge will form
Identify anomalous edges

Graph Terminology

In degree, out degree Temporal

Edges can be added over time
Nodes and edges can be added over time
Nodes and edges can be added and removed over time Multigraphs, hypergraphs

Featurizing the graph

Using the matrix itself doesn’t work because there are tons of adjacency matrices for a certain graph

Graphlet-based featurization

look at little structures, count up how many you see
kinda like one-hot encoding or histogram
problem: loses a lot of structure and important informatio

Simple features - graph

create features that are much simpleier
Persistence: the smallest number of links whose removal increases the diameter or disconnect the graph
Diameter: longest shortest path
- longest path I can make between two nodes in the graph that is still the shortest path between them
- The shortest path between the two furthest points on the graph (e.g. on campus)
- e.g. the two people who are the furthest possibly apart - six degrees of separation
- Will probably go through the most connected person in the world in the middle!
Average in degree/out degree
Nodes/edges added or deleted per time T
Distribution of degree over the network

Node-level features

In degree/out degree
Betweenness: number of shortest paths through that node
- High betweenness: people who interact with multiple disparate communities
Degree centrality: degree
Closeness centrality: average distance from every node, from this node
- How near the center they are
Eigenvector centrality

Link prediction & Anomaly detection

We would like to predict when a link forms in a graph

Recommend friends on Facebook
Figure out who will be new terrorists
Predict where money/resources will flow Measure how unexpected a link is
Fraudulent transactions
Suspicious network traffic Turn this into a machine learning problem

Local Similarity Based Measures

Common Neighbors: How many neighbors the nodes have in common
Jaccard Coefficient: $$ J(A,B)=\frac{|A \cap B|}{|A \cup B|}

- Adamic/Adar - Number of Length 3 paths between them MOST LINKS DON'T FORM! ### Node Embeddings Turn nodes into vectors What should it mean for two nodes to be closer together? If they're in the same social neighborhood, they should be in the same shared vector embedding space neighborhood -> NETWORK HOMOPHILY We want nodes that are in similar neighborhoods to be closer in the latent space. Nodes that fill similar roles should be closer together in the graph -> STRUCTURAL EQUIVALENCE e.g. we keep all the "popular kids" and all the "loners" together -> Random Walks A random walk is where you randomly walk around Do a big random walk, blank out one of the nodes, then train the neural network to predict it - learns to fill in valid paths in the graph - Once I do this $\forall$ nodes, what I actually have is a set of embeddings for every single node - Learn weights -> decent vector embedding #### Improvements Let's introduce two hyperparameters $p$ - return parameter, controls the likelihood of immediately returning to a node we just visited in a walk $q$ - the in-out parameter, controls how likely we are to stay in the neighborhood of node $u$, or are we more likely to visit others If we set up $p$ and $q$ in a BFS behavior, we'll end p with a structural-equivalent embedding DFS - going to learn local pathways ## Community Detection Algorithm Invented at UMD ### Network Topology - Understand where weak points in a network lie - Bridges: connecting two different communities are weak ties - An edge is a bridge if its removal results in disconnection of its terminal vertices. - Bridges - rare in real-life networks - Idea: relax the definition by checking if the distance between two terminal vertices increases if edge is removed - The larger the distance, the weaker the tie is - Community - things that have bridges between them ### Algorithm Remove as many bridges as possible, decompose the graph into a hierarchy of communities - Delete edges that will cause the shortest paths to increase - Keep doing that until I get disconnected components - First set: first of the hierarchy - Then look inside each community of disconnected components - Delete all of the bridges - most shortest paths - Keep deleting until we get disconnected components - Recursively rinse and repeat - Individual people -> base case Neighborhood overlap:

overlap(v_{i},v_{j})= \frac{|N_{i}\cap N_{j}|}{|N_{i}\cup N_{j}|-2}

## Graph Convolutional Networks

Davy's Notes

Explorer

5-6-2025

Exam 3 Topics

Graphs

Protein Folding Problem

Computer Networks

Graph Problems

Graph Level Tasks

Node Level Tasks

Edge Level Tasks

Graph Terminology

Featurizing the graph

Simple features - graph

Node-level features

Link prediction & Anomaly detection

Local Similarity Based Measures

Graph View

Table of Contents

Davy's Notes

Explorer

5-6-2025

Exam 3 Topics

Graphs

Protein Folding Problem

Social Networks

Computer Networks

Graph Problems

Graph Level Tasks

Node Level Tasks

Edge Level Tasks

Graph Terminology

Featurizing the graph

Simple features - graph

Node-level features

Link prediction & Anomaly detection

Local Similarity Based Measures

Graph View

Table of Contents