- context switch and interrupts/exceptions
  - exception: processor runs the OS [kernel] for some reason
  - context switch: the OS unloads the state for one process/thread and loads the state for a different process/thread
  - the only connection between exceptions and context switches is that the OS needs to be running for a context switch to happen
  - so if we see process A was running and then process B was running, that means something needed to run the OS for a little bit of time in between
    - that something is some kind of exception
    - if process A didn't do anything to ask the OS to run, it must have been triggered by an event external to process A, like
      --- input from somewhere
      --- a device is ready for output
      --- a timer [set up previously] expiring
  - process A might ask the OS to run for a reason that isn't explicitly "switch to another program", but the OS might switch anyway
    example: it asks for keyboard input, and the OS might decide to switch away because there's no input right now
- timer exceptions:
  - the OS wants to make sure, as a last resort, that it runs if a program takes too long
    solution: the hardware supplies the OS a timer
  - before running process A, the OS sets the timer for (as an example) 1 ms
  - if the OS doesn't reset the timer later, the processor will trigger an exception in 1 ms and run the OS regardless of what the program was doing
- I/O and exceptions
  - when devices need attention from the OS, they can ask the processor to run the OS
  - when do they need attention?
    - when they have input ready for the OS to do something with
    - when the OS might want to send output and they are now ready to take that output (but weren't before)
  - NB: not every time a program accesses an I/O device --- just when the OS wouldn't otherwise be trying to handle that I/O
- F2024 Q6 signals
  - chat program like the signals lab
  - we send a message; what exceptions happen before we send the message?
    - both programs have set up a shared memory region
    - both programs have set up a signal handler
    - both programs are waiting for terminal input [with a system call]
    - almost certainly the OS is running something that is not these programs (b/c they can't do anything until they get input)
    - I/O exception to tell the OS there's input available for this program
      [optionally (depending on how much the OS likes switching away from programs),
       the OS might get the input but not run the chat program right away;
       if so, we might need a timer interrupt before the OS switches]
    - then the chat program will get the input (return value of a system call)
    - then the chat program will store the input in the shared memory
      - doesn't require any exceptions, assuming the memory is already in our page table
    - then the chat program will send a signal
      - requires a system call
    - the OS will eventually run the signal handler
      - could do this immediately, during the system call to send the signal
      - or could do it later, in response to some other exception
    - then the other chat program will output to the terminal
      - system call
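  [a sketch, not the actual F2024 code: the system calls from the walkthrough above, in C --- install the handler with sigaction, wait for terminal input with read, send the signal with kill; the handler name and the use of getpid() as the "other" process are placeholders]

      /* Minimal sketch of the signal-related system calls from the walkthrough above.
         The handler name, the signal choice, and getpid() standing in for the other
         chat program's pid are all placeholders for illustration. */
      #include <signal.h>
      #include <stdio.h>
      #include <unistd.h>

      static volatile sig_atomic_t got_message = 0;

      /* runs whenever the OS decides to deliver SIGUSR1 to this process */
      static void handle_message(int signum) {
          (void) signum;
          got_message = 1;
      }

      int main(void) {
          struct sigaction sa = {0};
          sa.sa_handler = handle_message;
          sigemptyset(&sa.sa_mask);
          sigaction(SIGUSR1, &sa, NULL);         /* system call: install the handler */

          char buf[512];
          ssize_t n = read(0, buf, sizeof buf);  /* system call: wait for terminal input */
          if (n > 0) {
              /* ...copy buf into the shared memory region (no exception needed if
                 those pages are already in our page table)... */
              pid_t other = getpid();            /* placeholder: really the other chat program's pid */
              kill(other, SIGUSR1);              /* system call: ask the OS to deliver a signal */
          }
          printf("handler ran? %d\n", (int) got_message);
          return 0;
      }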
- cache write policies [write-back, write-allocate]
  - let's say the processor requests to write 2 bytes to 0x104 with value 0x1234,
    and the cache block containing 0x104 ranges from addresses 0x100 through 0x10F inclusive
  - the cache's flowchart:
    * look up address 0x104 in the cache; if it's NOT present, then
      [IF A WRITE-NO-ALLOCATE CACHE] --- STOP! just send the write to the next level
      [IF A WRITE-ALLOCATE CACHE] --- add it to the cache
        if adding it to the cache, we need to read 0x100-0x103 and 0x106-0x10F inclusive from the next level
          problem if we skip that read: if the process later reads from 0x100, we're going to think that
          value is correctly cached, so if we didn't read it, we can't give the processor back the right value
    * modify offsets 4 and 5 of the cache block to contain 0x1234
    * [IF A WRITE-THROUGH CACHE] --- send the write of 0x1234 to bytes 0x104-0x105 to the next level
      (write-through cache ---> no dirty bits, we CANNOT mark the block as dirty)
    * [IF A WRITE-BACK CACHE] --- mark the block as dirty
  - [IF A WRITE-BACK CACHE] when we REPLACE something in the cache (regardless of read or write),
    if it's dirty, we have to write it to the next level
    - we have to write the WHOLE block, because we don't track more detail about what changed
  [a code sketch of this write path appears after the TLB notes below]
- TLB structure

                                                 vvvvvvvvvvvvvvvvvvvvvvvvvvvvv-------- permission bits + physical page number
      TLB mapping: [virtual page numbers] -----> [last-level page table entries]
                   ^^^^^^^^^^^^^^^^^^^^-- everything in a virtual address that's not the page offset

  - the TLB is a cache, organized like our memory caches, but:
    * instead of using an address as input, we use the virtual page number
    * when we split the input [VPN] into tag/index/offset, we have 0 offset bits

                               TLB index
                               vvvvvvvvv
          [virtual page number          ]
           ^^^^^^^^^^
           TLB TAG

    * use the TLB index to find a set
    * look for a matching tag in the set
      if so --- hit, and the block contains our PTE
      if not --- miss, and we load the correct PTE into the TLB
    * because we have 0 offset bits, we store one last-level page table entry per block
  - on a TLB hit:

          [virtual page number][page offset]
           ~~~~~~~~~~~~~~~~~~~  %%%%%%%%%%%
                    |                |
                   TLB               |
                    v                v
          [physical page num  ][page offset]

    (plus we check the permission bits retrieved from the TLB)
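  [going back to the write-policy flowchart above: a minimal simulation of the write path for a WRITE-BACK, WRITE-ALLOCATE cache; the block size, set count, and "next level" array are invented just to make it runnable]

      /* Tiny direct-mapped, write-back, write-allocate cache: just the write path.
         All sizes and structures are invented for illustration; assumes the write
         doesn't cross a block boundary. */
      #include <stdint.h>
      #include <stdio.h>
      #include <string.h>

      #define BLOCK_SIZE 16          /* so 0x100-0x10F is one block, like the example */
      #define NUM_SETS   4           /* tiny direct-mapped cache */
      #define MEM_SIZE   0x200

      static uint8_t next_level[MEM_SIZE];   /* stand-in for the next level of the hierarchy */

      struct cache_block {
          int valid, dirty;
          uint64_t block_address;            /* address of the first byte in the block */
          uint8_t data[BLOCK_SIZE];
      };
      static struct cache_block cache[NUM_SETS];

      static void cache_write(uint64_t address, const uint8_t *value, unsigned size) {
          uint64_t block_address = address - address % BLOCK_SIZE;
          unsigned offset = address % BLOCK_SIZE;
          struct cache_block *b = &cache[(block_address / BLOCK_SIZE) % NUM_SETS];

          if (!(b->valid && b->block_address == block_address)) {
              /* miss: write-allocate, so the block gets added to the cache */
              if (b->valid && b->dirty)
                  /* write-back: evicting a dirty block writes the WHOLE block out */
                  memcpy(&next_level[b->block_address], b->data, BLOCK_SIZE);
              /* fetch the block (the notes point out we really only need the bytes
                 we're not about to overwrite) so later reads of other bytes are correct */
              memcpy(b->data, &next_level[block_address], BLOCK_SIZE);
              b->valid = 1;
              b->dirty = 0;
              b->block_address = block_address;
          }
          /* modify only the written bytes (offsets 4 and 5 for a 2-byte write to 0x104) */
          memcpy(&b->data[offset], value, size);
          /* write-back: nothing is sent to the next level now, just mark the block dirty */
          b->dirty = 1;
      }

      int main(void) {
          uint8_t value[2] = {0x34, 0x12};   /* 0x1234, little-endian */
          cache_write(0x104, value, 2);
          printf("dirty after write: %d\n", cache[(0x104 / BLOCK_SIZE) % NUM_SETS].dirty);
          return 0;
      }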
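  [and for the TLB notes above: a small sketch of splitting a virtual address into page offset / VPN and the VPN into TLB index / TLB tag; the 4096-byte page size and 16-set TLB are example values, not numbers from the notes]

      /* Splitting a virtual address for a TLB lookup.  Page size (4096 bytes) and
         TLB set count (16) are example values, not from any quiz. */
      #include <stdint.h>
      #include <stdio.h>

      #define PAGE_SIZE 4096u    /* => 12 page-offset bits */
      #define TLB_SETS  16u      /* => 4 TLB index bits    */

      int main(void) {
          uint64_t virtual_address = 0x12345678;

          uint64_t page_offset = virtual_address % PAGE_SIZE;
          uint64_t vpn         = virtual_address / PAGE_SIZE;

          /* 0 offset bits within the TLB "block": the VPN splits only into index + tag */
          uint64_t tlb_index = vpn % TLB_SETS;
          uint64_t tlb_tag   = vpn / TLB_SETS;

          printf("VPN=0x%llx  TLB index=0x%llx  TLB tag=0x%llx  page offset=0x%llx\n",
                 (unsigned long long) vpn, (unsigned long long) tlb_index,
                 (unsigned long long) tlb_tag, (unsigned long long) page_offset);

          /* on a hit, the matching TLB entry supplies the physical page number and the
             permission bits; the page offset passes through unchanged:
             physical address = PPN * PAGE_SIZE + page_offset */
          return 0;
      }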
- threads and heap/stack/etc.
  - when one process has multiple threads, they all have the same page table
    - any pointer in one thread could be used by another thread
  - each thread in a process has a different stack, with a different address
    - new "empty" stack created by pthread_create()
    - usually we share data by passing pointers through pthread_create(), and pthread_create() won't copy anything the pointer points to
  - each thread in a process can pass pointers to the others and dereference a pointer created by another thread
    -- BUT if the pointer points to something that went out of scope, it won't work
    -- most commonly comes up with things on another thread's stack
    -- usually not a problem to have pointers to global variables/heap, b/c those won't go out of scope by accident
    [see the first sketch after the UDP v TCP notes below]
- broadcast v signal for condition variables
  - condition variable = list of waiting threads
  - broadcast = let all of them run and see if they can do something
  - signal = let one of them run and see if it can do something
  - both do nothing if no one is waiting
    [but that shouldn't be a problem, because we always check whether we need to wait before waiting,
     and do it with a lock to make sure whether we need to wait does not change while we're checking]
  - to use pthread_cond_wait correctly, we always need to double-check the condition:
        while (still need to wait)
            pthread_cond_wait(...);
  - this means that it's "safe" to broadcast from a correctness point of view
    - the worst that can happen is a bunch of threads realize they still need to wait when they double-check the condition
  - ideally, we'd like to not wake up threads that will just start waiting again
    SOMETIMES we can use signal instead of broadcast as part of doing this:
    - we have to know that at most one thread can go as a result of our action
    - example: if we placed one item on a queue and we signal each time, to threads that will each remove one item
      - we know in this case we're always waking up *enough* threads to handle the one item [what we need to know for correctness]
      - it might be the case that the thread we wake up doesn't get to go, because another thread snuck in before it (and was never waiting)
      - but in the typical case, we'll wake up exactly the right number of waiting threads
    [see the second sketch after the UDP v TCP notes below]
- UDP v TCP
  - "transport" layer protocols
  - the transport layer has two responsibilities that maybe should've been separated (but aren't on the Internet)
    - organizing data into streams or similar [even though it might not be sent that way] that are potentially reliable
      - TCP does this ~~ implements reliable transmission
      - UDP does NOT do this
    - getting data to the correct program on a machine (when there might be multiple)
      - TCP and UDP both do this
  - why would you choose UDP over TCP?
    - if you don't care about reliable transmission (and want lower overhead/latency)
      - example: for a video stream, you might prefer to NOT resend old data that was lost and instead get the most recent video frames
    - if you think you can do better than TCP at implementing reliable stream transmission
      - example: QUIC (used by some web browsers)
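  [first sketch, for the threads/stack notes above: the classic mistake of handing a new thread a pointer to something on the creating thread's stack; all names are invented; compile with -pthread]

      /* Passing pointers through pthread_create(); pthread_create() copies only the
         pointer, never what it points to. */
      #include <pthread.h>
      #include <stdio.h>
      #include <stdlib.h>

      static void *print_int(void *arg) {
          int *p = arg;            /* dereferencing a pointer made by another thread is fine... */
          printf("got %d\n", *p);  /* ...as long as what it points to is still alive */
          return NULL;
      }

      static pthread_t start_broken(void) {
          pthread_t t;
          int local = 42;                              /* lives on THIS thread's stack */
          pthread_create(&t, NULL, print_int, &local);
          return t;   /* 'local' goes out of scope here, so the new thread may read garbage */
      }

      static pthread_t start_ok(void) {
          pthread_t t;
          int *heap_value = malloc(sizeof *heap_value);   /* heap doesn't go out of scope by accident */
          *heap_value = 42;
          pthread_create(&t, NULL, print_int, heap_value);
          return t;   /* (leaked for brevity; real code would free it in the new thread) */
      }

      int main(void) {
          pthread_join(start_ok(), NULL);       /* prints 42 */
          pthread_join(start_broken(), NULL);   /* undefined behavior: may print garbage or crash */
          return 0;
      }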
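  [second sketch, for broadcast v signal above: a tiny queue where each added item lets at most one waiter go, so pthread_cond_signal is enough; names are invented; compile with -pthread]

      /* A tiny shared queue protected by a lock and a condition variable. */
      #include <pthread.h>

      #define CAPACITY 16

      static int queue[CAPACITY];
      static int count = 0;    /* how many items are currently in the queue */
      static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
      static pthread_cond_t item_available = PTHREAD_COND_INITIALIZER;

      void add_item(int value) {
          pthread_mutex_lock(&lock);
          queue[count++] = value;          /* (ignoring the queue-full case for brevity) */
          /* one item added => at most one waiter can make progress, so signal is enough;
             broadcast would also be correct, just wasteful (extra threads wake up,
             recheck the condition, and go right back to waiting) */
          pthread_cond_signal(&item_available);
          pthread_mutex_unlock(&lock);
      }

      int remove_item(void) {
          pthread_mutex_lock(&lock);
          /* always a while, never an if: we might wake up and find another thread
             (possibly one that was never waiting at all) already took the item;
             checking under the lock also keeps "do we need to wait?" from changing
             while we're checking it */
          while (count == 0)
              pthread_cond_wait(&item_available, &lock);
          int value = queue[--count];
          pthread_mutex_unlock(&lock);
          return value;
      }

      static void *consumer(void *arg) {
          (void) arg;
          return (void *) (long) remove_item();
      }

      int main(void) {
          pthread_t t;
          pthread_create(&t, NULL, consumer, NULL);
          add_item(42);   /* if the consumer isn't waiting yet, the signal does nothing,
                             but that's fine: the consumer checks count before waiting */
          pthread_join(t, NULL);
          return 0;
      }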
- certificates/certificate authorities --- why useful
  - we can communicate securely with a machine if we have its public keys
  - problem: how do we get public keys?
  - part 1 of the modern solution to this:
    - we ship a list of public keys with OSes and browsers
    - BUT:
      - we can't include all the public keys for every machine we want to talk to
      - we can't change this list of public keys as fast as the public keys of the machines we want to talk to change
  - part 2 of the modern solution:
    - the public keys we ship are for certificate authorities
    - those public keys are used to verify (via a digital signature) that those certificate authorities are sending us another public key
    - for this to make sense, we need to trust the certificate authority to verify other public keys
      [open question: are they actually trustworthy?]
    - rather than having the certificate authorities send the public key directly, the certificate authorities generate a message
      for foo.com saying "I, this Certificate Authority, think foo.com's public key is X", and foo.com forwards that message to the browser/etc.
      - that message comes with a digital signature so we can verify it actually comes from the certificate authority
      - the message is called a "certificate"
  - part 3 of the modern solution:
    - it's not practical to have certificate authorities directly sign the messages for all the foo.coms, so they delegate it
    - we use another message from the certificate authority so browsers can know that it's delegated to a "sub-certificate authority"
    - result: "certificate chains"
      - the browser is sent:
        - certificate 1: "Certificate Authority A delegates to Certificate Authority B with public key Y"
          [signed by Certificate Authority A's private key]
        - certificate 2: "Certificate Authority B thinks foo.com's public key is X"
          [signed by Certificate Authority B's private key, the one corresponding to public key Y]
- branch prediction --- when does squashing happen
  - pipelined processor: we keep running the incorrect prediction until we compute the actual result of the jump in the pipeline HW

                   v--------- first time we can fetch based on what we computed
                v------------ computed where the jump should really go
      F  D  E  M  W           jmp
         F  D                 prediction
            F                 after prediction
               F              finally fetching based on what was actually computed
                              [b/c this was the first time we had the information about where to go in time to fetch something]

- Q1 on this week's quiz [Sidechannel]
  - a function that checks whether its input matches some rotated version of a hidden_string
  - we learned that it returns true for ABCDEFGH
    - so hidden_string is ABCDEFGH OR BCDEFGHA OR ...
  - the function works by trying rotations until one matches, then returns true
  - so we can figure out which rotation is actually in hidden_string by seeing which candidate is fastest to return true
    - the candidate equal to hidden_string should have the shortest time, because for it the correct rotation is tested first
  [see the timing sketch after Q2 below]
- Q2 on this week's quiz [Spectre]

      Prime()
      *pointer += ...;
      Probe()

  - and we found out that array index 512 was slow in Probe()
  - and array index 512 was at 0x14000000 + 512 --> cache set index 2 (b/c the cache has 256-byte blocks)
  - so the correct answers are: addresses in cache set 2 that are not in the array
    - cache set 2 = could evict array[512]
    - not in the array = needed to be added to the cache
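  [the timing sketch for Q1: timing each candidate rotation; the hidden string, the checking function, and the repetition count are invented stand-ins for the quiz's code, and a heavily optimizing compiler may distort the measurement]

      /* Timing each candidate rotation for Q1; everything here is a stand-in. */
      #include <stdio.h>
      #include <string.h>
      #include <time.h>

      static const char hidden_string[] = "CDEFGHAB";   /* stand-in secret */

      /* stand-in checker: tries rotating the input 0, 1, 2, ... positions until one
         rotation equals hidden_string, so its running time leaks how many rotations
         it had to try before succeeding */
      static int matches_rotated(const char *input) {
          size_t n = strlen(input);
          for (size_t r = 0; r < n; r++) {
              int ok = 1;
              for (size_t i = 0; i < n && ok; i++)
                  ok = (input[(i + r) % n] == hidden_string[i]);
              if (ok)
                  return 1;
          }
          return 0;
      }

      int main(void) {
          const char *candidates[] = {"ABCDEFGH", "BCDEFGHA", "CDEFGHAB", "DEFGHABC",
                                      "EFGHABCD", "FGHABCDE", "GHABCDEF", "HABCDEFG"};
          volatile int sink = 0;   /* keep the calls from being optimized away entirely */
          for (int c = 0; c < 8; c++) {
              struct timespec start, end;
              clock_gettime(CLOCK_MONOTONIC, &start);
              for (int rep = 0; rep < 100000; rep++)   /* one call is far too noisy to time */
                  sink += matches_rotated(candidates[c]);
              clock_gettime(CLOCK_MONOTONIC, &end);
              long ns = (end.tv_sec - start.tv_sec) * 1000000000L
                        + (end.tv_nsec - start.tv_nsec);
              /* the fastest candidate should be the one equal to hidden_string:
                 for that input, the matching rotation is the first one tried */
              printf("%s: %ld ns\n", candidates[c], ns);
          }
          return (int) sink;
      }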
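  [and the set-index arithmetic from Q2 (used again in Q3 below), written out; the 256-byte block size comes from the notes, but the number of sets is an assumption just to make the example concrete]

      /* Address -> cache set index, as in Q2.  Block size is from the notes (256 bytes);
         the number of sets (1024) is an assumption for illustration. */
      #include <stdint.h>
      #include <stdio.h>

      #define BLOCK_SIZE 256u
      #define NUM_SETS   1024u   /* assumed, not given in the notes */

      int main(void) {
          uint64_t address = 0x14000000 + 512;              /* address of the slow array index */
          uint64_t block_number = address / BLOCK_SIZE;     /* which block the address is in */
          uint64_t set_index = block_number % NUM_SETS;     /* which set that block maps to */
          printf("address 0x%llx -> set %llu\n",
                 (unsigned long long) address, (unsigned long long) set_index);
          /* 0x14000200 / 0x100 = 0x140002; 0x140002 mod 1024 = 2, matching "cache set index 2" */
          return 0;
      }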
- Q3 on this week's quiz [Spectre]

      array[mystery * 2]
      array[mystery * 3]

  - these accessed cache set indices 514, 515, 739

                                               cache set 6xx
                                               vvvvvvvvvvvvvvvvvvvvvvv
  - the array was address 0x10004000 through __ + 1000 * sizeof(int)
                          ^^^^^^^^^^-- cache set index 512

  - 739 must be outside the array (probably mystery)
  - array[0 to 3] = cache set 512
    array[4 to 7] = cache set 513
    ...
  - we can use this to figure out the ranges of array indices that the cache set 514 and 515 accesses could have come from
  - we know that the 514 access was mystery * 2 and the 515 access was mystery * 3
    [so mystery * 2 must be one of indices 8-11 and mystery * 3 one of indices 12-15, which narrows mystery down to 4 or 5]
- Q4 on this week's quiz [Spectre]

      int GetLocalPortFor(int file_number) {
          if (file_number < 0 || file_number >= 4096) {
              return -1;
          } else if (!all_files[file_number].is_socket) {
              return -1;
          } else {
              // IN ATTACK: the processor will branch predict that this should run
              // (but it will be wrong)
              int socket_index = all_files[file_number].socket_index;
              //                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ --- the address will be computed based on this
              // IN ATTACK: all_files[file_number].socket_index will be out of bounds
              // and chosen by the attacker to match some value they want to learn about
              struct SocketInfo *socket_info = &all_sockets[socket_index];
              return socket_info->local_port;
              //     ^^^^^^^^^^^^^^^^^^^^^^^ --- we will learn about the address of this cache access
          }
      }
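  [a sketch tying Q4 back to the prime/probe idea from Q2: which cache set the mispredicted all_sockets access touches, as a function of the out-of-bounds socket_index; every number here (base address, struct size, cache geometry) is an assumption, not something given in the quiz]

      /* Which cache set the mispredicted all_sockets[socket_index] access touches.
         All constants below are invented for illustration. */
      #include <stdint.h>
      #include <stdio.h>

      #define ALL_SOCKETS_BASE 0x20000000u   /* assumed address of all_sockets[] */
      #define SOCKET_INFO_SIZE 64u           /* assumed sizeof(struct SocketInfo) */
      #define BLOCK_SIZE       64u           /* assumed cache block size */
      #define NUM_SETS         1024u         /* assumed number of cache sets */

      int main(void) {
          /* the value the attacker wants to learn ends up in socket_index */
          for (uint64_t socket_index = 0; socket_index < 4; socket_index++) {
              uint64_t address = ALL_SOCKETS_BASE + socket_index * SOCKET_INFO_SIZE;
              uint64_t set = (address / BLOCK_SIZE) % NUM_SETS;
              printf("socket_index %llu -> touches cache set %llu\n",
                     (unsigned long long) socket_index, (unsigned long long) set);
          }
          /* the attacker primes the cache, triggers the mispredicted access with an
             out-of-bounds file_number, then probes to see which set got slower;
             inverting the arithmetic above turns that set number back into socket_index */
          return 0;
      }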