Computer networking problems and solutions by Russ White, Ethan Banks
- rfc1925 rule 11: "Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works."
- rfc1925
- to marshal - assemble and arrange (a group of packets or people, especially troops) in order
1
- circuit vs packets switched networks
- TDM = Time Division Multiplexing
- Frame Relay, SONET, ISDN, and X.25 are examples of circuit switched technology
- control plane is the set of protocols and processes that build the information necessary for the network devices to forward traffic through the network
- data plane (also known as the forwarding plane) is the path of information through the network
- management plane is focused on managing the network devices, including monitoring the available memory, monitoring queue depth, and monitoring when the device drops the information being transmitted through the network, etc
- distance vector protocols - calculate loop-free paths hop by hop based on the path cost
- link state protocols - calculate loop-free paths across a database synchronized across the network devices
- path vector protocols - calculate loop-free paths hop by hop based on a record of previous hops
2
- multiplexing - placing multiple bits on the same medium at once or allowing multiple hosts to communicate using the same medium at once
- ships in the night solution; the old and new protocols (or versions of the same protocol) do not interact at all
- flag day - a day (and potentially a time, down to the millisecond in some cases) to switch from the old protocol to the new
- TLV = Type Length Value
- CRC = Cyclic Redundancy Check
- FEC = Forward Error Correction
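
A minimal Python sketch of how TLV encoding and a CRC check fit together. The field widths (1-byte type, 2-byte length), the type codes, and the helper names are assumptions for illustration, not any specific protocol's format. The CRC here is the standard CRC-32 from `zlib`.

```python
import struct
import zlib

def encode_tlv(tlv_type, value):
    """Pack one TLV: 1-byte type, 2-byte big-endian length, then the value."""
    return struct.pack("!BH", tlv_type, len(value)) + value

def decode_tlvs(data):
    """Walk a byte string and return a list of (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        tlv_type, length = struct.unpack_from("!BH", data, offset)
        offset += 3  # skip the type and length header
        value = data[offset:offset + length]
        if len(value) != length:
            raise ValueError("truncated TLV")
        tlvs.append((tlv_type, value))
        offset += length
    return tlvs

def add_crc(payload):
    """Append a 4-byte CRC-32 trailer so the receiver can detect corruption."""
    return payload + struct.pack("!I", zlib.crc32(payload))

def check_crc(frame):
    """Recompute the CRC-32 over the payload and compare it with the trailer."""
    payload = frame[:-4]
    (received,) = struct.unpack("!I", frame[-4:])
    return zlib.crc32(payload) == received
```

A receiver that does not recognise a type can still skip it, because the length tells it where the next TLV starts. A CRC only detects errors; correcting them without retransmission is the job of FEC.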
3
- RINA = Recursive InterNetwork Architecture
- four functions any data-carrying protocol can serve: transport, multiplexing, error correction, and flow control
4
- bandwidth, throughput, goodput
5
- QUIC = Quick UDP Internet Connections
- all nontrivial abstractions leak
6
- If information about the location of a service has been cached from a prior request, it is returned as a nonauthoritative answer; if the actual server configured to hold the information about a domain replies, its answer is authoritative.
- DUID = DHCP Unique IDentifier
7
- MLAG = Multichassis Link AGgregation - aggregating multiple physical switches into one logical switch
- NPU = Network Processing Unit (GPU for network stuff) - subset of ASICs
8
- The public Internet is a best effort transport. There are no guarantees of even traffic delivery, let alone traffic prioritization.
- UDP = User Datagram Protocol
- controlled delay, or CoDel. CoDel assumes an oversized buffer but manages packet delay by monitoring how long a packet has been in the queue. This is known as the sojourn time. When the packet sojourn time has exceeded the computed ideal, the packet is dropped. This means packets at the head of the line—those that have waited the longest—are going to be dropped before packets currently at the tail end of the queue.
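
A toy Python sketch of the sojourn-time idea only. It is deliberately simplified: real CoDel (RFC 8289) also uses an interval and a control law to decide when to drop, which is left out here. The class name, the millisecond units, and the explicit `now` parameter are assumptions for illustration.

```python
from collections import deque

class SojournQueue:
    """Toy queue that drops packets whose sojourn time exceeds a target."""

    def __init__(self, target_ms):
        self.target_ms = target_ms  # maximum acceptable time in the queue
        self.packets = deque()      # (enqueue_time_ms, packet), head at the left
        self.dropped = []

    def enqueue(self, packet, now_ms):
        # Timestamp each packet on arrival so its sojourn time can be measured.
        self.packets.append((now_ms, packet))

    def dequeue(self, now_ms):
        """Return the next packet that has not waited too long, or None."""
        while self.packets:
            enqueued_at, packet = self.packets.popleft()
            sojourn = now_ms - enqueued_at
            if sojourn > self.target_ms:
                self.dropped.append(packet)  # drop from the head of the queue
                continue
            return packet
        return None
```

With a 5 ms target, a packet that arrived at t=0 is dropped when the queue is serviced at t=10, while one that arrived at t=6 is forwarded.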
9
- SR = Segment Routing - pop, push, continue labels
- SRLG = Shared Risk Link Group
10
- nonce - randomly selected series of numbers (coined for one occasion)
11
- MTU size itself can have a large impact on the performance of a control plane in terms of its speed of convergence
12
- SPT (Shortest Path Tree) describes the shortest path to each destination in the network, independent of the total cost of the graph
- MST (Minimum Spanning Tree) is a tree that visits each node in the network with the minimum overall cost (normally measured as the sum of all the links chosen in the network)
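
A small Python sketch showing that the two trees can differ on the same graph. The three-node topology and its link costs are made up for illustration. Dijkstra builds the SPT from a root and Prim builds an MST.

```python
import heapq

# Hypothetical triangle: A-B costs 2, B-C costs 2, A-C costs 3.
GRAPH = {
    "A": {"B": 2, "C": 3},
    "B": {"A": 2, "C": 2},
    "C": {"A": 3, "B": 2},
}

def shortest_path_tree(graph, root):
    """Dijkstra: return (distance to each node, parent of each node)."""
    dist = {root: 0}
    parent = {}
    heap = [(0, root)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if neighbor not in dist or candidate < dist[neighbor]:
                dist[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    return dist, parent

def minimum_spanning_tree(graph, root):
    """Prim: return the parent of each node in a minimum spanning tree."""
    parent = {}
    visited = {root}
    heap = [(cost, root, neighbor) for neighbor, cost in graph[root].items()]
    heapq.heapify(heap)
    while heap:
        cost, src, dst = heapq.heappop(heap)
        if dst in visited:
            continue
        visited.add(dst)
        parent[dst] = src
        for neighbor, c in graph[dst].items():
            if neighbor not in visited:
                heapq.heappush(heap, (c, dst, neighbor))
    return parent

def tree_cost(graph, parent):
    """Sum the cost of every link chosen for the tree."""
    return sum(graph[child][p] for child, p in parent.items())
```

Rooted at A, the SPT keeps links A-B and A-C (total cost 5, but C is reached at cost 3), while the MST keeps A-B and B-C (total cost 4, but C is reached at cost 4).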
13
- Dijkstra trained as a theoretical physicist
14
- serialization - converting packets/frames into bits
- CAP theorem - Consistency, Availability, and Partition tolerance
15
- BUM traffic - Broadcast, Unknown unicast, Multicast
16
- IS = Intermediate System (router); ES = End System (host)
- LSP = Link State Packet; CSNP = Complete Sequence Number Packet; PSNP = Partial Sequence Number Packet; DIS = Designated IS
17
- Any flow taking up more than around 20% of the available bandwidth of a single link and persisting for more than two or three minutes might, for instance, be classified as an elephant flow
- Mouse flows, on the other hand, are much lower bandwidth, say less than 1% of the available bandwidth on any link, and tend to last for very short periods of time
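
The rough thresholds above can be sketched as a classifier. The function name, the default values (20%, two minutes, 1%), and the "other" bucket are assumptions taken from the loose figures in these notes, not a standard definition.

```python
def classify_flow(rate_bps, duration_s, link_bps,
                  elephant_share=0.20, elephant_min_s=120, mouse_share=0.01):
    """Label a flow by its share of link bandwidth and how long it persists."""
    share = rate_bps / link_bps
    if share > elephant_share and duration_s > elephant_min_s:
        return "elephant"
    if share < mouse_share:
        return "mouse"
    return "other"  # everything between the two rough thresholds
```

On a 10 Gb/s link, a 3 Gb/s flow lasting five minutes is an elephant, and a 50 Mb/s flow lasting two seconds is a mouse.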
18
- fibbing - inserting false nodes, similar to pseudonodes, into the link state database, causing OSPF and IS-IS to change the shortest path, and hence engineering traffic flows through the network
19
- I need to understand more math/physics
20
- flooding domain is a set of routers with completely synchronized databases
- EVPN = Ethernet VPN
21
- OODA = Observe, Orient, Decide, Act
22
- When you get 16 designers in a room, what you will have is 1 person drawing on the white board, and the other 15 erasing.
23
- MTBF = Mean Time Between Failures
- MTTR = Mean Time to Repair
- Five 9s of availability means the network is available 99.999% of the time, or is not operational for about 5.26 minutes each year.
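
The arithmetic behind the downtime figure, as a short Python sketch. The relation availability = MTBF / (MTBF + MTTR) is the usual steady-state formula and is an addition here, not something stated in these notes. The function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years

def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year for a given availability (0 to 1)."""
    return (1 - availability) * MINUTES_PER_YEAR

def availability_from(mtbf_hours, mttr_hours):
    """Steady-state availability from mean time between failures and to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

Five 9s works out to 0.00001 x 525,600 = 5.256 minutes per year. A device with an MTBF of 9,999 hours and an MTTR of 1 hour is available 99.99% of the time, which is only four 9s.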
24
- models are a two-edged sword: they present a more readily understandable version of a system, but they also present a necessarily incomplete version of a system
- you cannot know what “broken” looks like unless you know what “normal” looks like
- half split method
- technical debt - doing something that will either cause fixing a problem in the future to be more complex or will result in a similar failure mode happening in the future
- A temporary fix incurs technical debt; a permanent fix either reduces technical debt or leaves it constant.
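
The half split method is essentially a binary search along the path. A minimal Python sketch, assuming the hops are ordered from source to destination, that every hop before the fault tests healthy, and that every hop from the fault onward tests unhealthy. The function name and the `is_healthy` callback are hypothetical.

```python
def first_failing_hop(hops, is_healthy):
    """Return the first hop that fails the health check, or None if all pass."""
    low, high = 0, len(hops)
    while low < high:
        mid = (low + high) // 2
        if is_healthy(hops[mid]):
            low = mid + 1   # fault is further along the path
        else:
            high = mid      # fault is at this hop or earlier
    return hops[low] if low < len(hops) else None
```

Each test halves the remaining search space, so a six-hop path needs at most three checks instead of six.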
25
- Traffic being carried to and from servers from outside the data center is called north/south traffic, as it is traveling between the top and bottom of the network diagram as traditionally drawn.
- Traffic between devices inside the data center is called east/west traffic, as it is flowing from one device connected to the data center network to another.
- ToR = Top of Rack
26
- SOS model - State, Optimization, Surface
- IaC = Infrastructure as Code
27
- NFV = Network Function Virtualization
- VNF = Virtualized Network Function
28
- Each feature available in a network device, or a cloud-based service, represents some amount of code—code that must interact with the code providing other configured, in-use features. These features, and the code they represent, are perfect gateways into failures through unintended consequences, potential security holes in waiting, and a larger attack surface.
- Some cloud providers have, in the past, used a partnership with a customer to learn how to build and support a particular business model, and then used the experience to enter the market as a direct competitor to their own customer. Providing services for unique businesses can be a great incubation strategy for cloud providers to spin up internal analogs to the customers they are supporting, eventually broadening their market reach.
- QoE = Quality of Experience
29
- BCP = Best Current Practice
30
- rule 11 rephrasing: By learning what is old, you can learn what will be proposed as new in the future.