There are bound to be situations in which this isn’t enough, such as when you want to read in a large amount of text from a file. Using the OpenAI API allows you to send many more tokens in a messages array, with the maximum number depending on your chosen model. This lets you provide large amounts of text to ChatGPT using chunking. Here’s how.
The gpt-4 model currently has a maximum context length of 8,192 tokens. (Here are the docs containing current limits for all the models.) Remember that you can first apply text preprocessing techniques to reduce your input size – in my previous post I achieved a 28% size reduction, without losing meaning, with just a little tokenization and pruning.
When this isn’t enough to fit your message within the maximum message token limit, you can take a general programmatic approach that sends your input in message chunks. The goal is to divide your text into sections that each fit within the model’s token limit. Each chunk is sent as a separate message in the conversation thread, using the OpenAI library’s ChatCompletion endpoint. ChatGPT returns individual responses for each message, so you may want to post-process these, for example by replacing \n with line breaks.

Using the OpenAI API, you can send multiple messages to ChatGPT and ask it to wait for you to provide all of the data before answering your prompt. Since ChatGPT is a language model, you can provide these instructions in plain language. Here’s a suggested script:
Prompt: Summarize the following text for me
To provide the context for the above prompt, I will send you text in parts. When I am finished, I will tell you “ALL PARTS SENT”. Do not answer until you have received all the parts.
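The chunking step itself can be sketched with a simple whitespace heuristic. This is a minimal sketch only – a real implementation would count model tokens exactly (for example with tiktoken), and the function name here is illustrative:

```python
def chunk_text(text, max_tokens=2500):
    # Approximate tokens by whitespace-separated words; a real
    # implementation would encode with tiktoken for exact counts.
    words = text.split()
    return [
        " ".join(words[i : i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]
```

Each returned chunk can then be appended to the messages array as a separate user message.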
I created a Python module, chatgptmax, that puts all this together. It breaks up a large amount of text by a given maximum token length and sends it in chunks to ChatGPT. You can install it with pip install chatgptmax, but here’s the juicy part:
import os

import openai
import tiktoken

# Load your OpenAI API key from an environment variable or secret management service
openai.api_key = os.getenv("OPENAI_API_KEY")


def send(
    prompt=None,
    text_data=None,
    chat_model="gpt-3.5-turbo",
    model_token_limit=8192,
    max_tokens=2500,
):
    """
    Send the prompt at the start of the conversation and then send chunks
    of text_data to ChatGPT via the OpenAI API. If the text_data is too long,
    it splits it into chunks and sends each chunk separately.

    Args:
    - prompt (str, optional): The prompt to guide the model's response.
    - text_data (str, optional): Additional text data to be included.
    - chat_model (str, optional): The model to use. Default is "gpt-3.5-turbo".
    - model_token_limit (int, optional): The model's maximum context length. Default is 8192.
    - max_tokens (int, optional): Maximum tokens for each API call. Default is 2500.

    Returns:
    - list or str: A list of the model's responses for each chunk, or an error message.
    """
    # Check if the necessary arguments are provided
    if not prompt:
        return "Error: Prompt is missing. Please provide a prompt."
    if not text_data:
        return "Error: Text data is missing. Please provide some text data."

    # Initialize the tokenizer
    tokenizer = tiktoken.encoding_for_model(chat_model)

    # Encode the text_data into token integers
    token_integers = tokenizer.encode(text_data)

    # Split the token integers into chunks based on max_tokens,
    # leaving room for the prompt's own tokens
    chunk_size = max_tokens - len(tokenizer.encode(prompt))
    chunks = [
        token_integers[i : i + chunk_size]
        for i in range(0, len(token_integers), chunk_size)
    ]

    # Decode token chunks back to strings
    chunks = [tokenizer.decode(chunk) for chunk in chunks]

    responses = []
    messages = [
        {"role": "user", "content": prompt},
        {
            "role": "user",
            "content": "To provide the context for the above prompt, I will send you text in parts. When I am finished, I will tell you 'ALL PARTS SENT'. Do not answer until you have received all the parts.",
        },
    ]

    for chunk in chunks:
        messages.append({"role": "user", "content": chunk})

        # Check if total tokens exceed the model's limit and remove the oldest
        # chunks if necessary (indices 0 and 1 hold the prompt and instructions,
        # so the oldest chunk is at index 2)
        while (
            sum(len(tokenizer.encode(msg["content"])) for msg in messages)
            > model_token_limit
        ):
            messages.pop(2)  # Remove the oldest chunk

        response = openai.ChatCompletion.create(model=chat_model, messages=messages)
        chatgpt_response = response.choices[0].message["content"].strip()
        responses.append(chatgpt_response)

    # Add the final "ALL PARTS SENT" message
    messages.append({"role": "user", "content": "ALL PARTS SENT"})
    response = openai.ChatCompletion.create(model=chat_model, messages=messages)
    final_response = response.choices[0].message["content"].strip()
    responses.append(final_response)

    return responses
Here’s an example of how you can use this module with text data read from a file. (chatgptmax also provides a convenience method for getting text from a file.)
# First, import the function
from chatgptmax import send


# Define a function to read the content of a file
def read_file_content(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()


# Use the function
if __name__ == "__main__":
    # Specify the path to your file
    file_path = "path_to_your_file.txt"
    # Read the content of the file
    file_content = read_file_content(file_path)
    # Define your prompt
    prompt_text = "Summarize the following text for me:"
    # Send the file content to ChatGPT
    responses = send(prompt=prompt_text, text_data=file_content)
    # Print the responses
    for response in responses:
        print(response)
While the module is designed to handle most standard use cases, there are potential pitfalls to be aware of:
As with any process, there’s always room for improvement. Here are a couple of ways you might optimize the module’s chunking and sending process further:
If you’re using 32k models or need to use small chunk sizes, however, parallelism gains are likely to be minimal.

If you found your way here via search, you probably already have a use case in mind. Here are some other (startup) ideas:
Do you have a use case I didn’t list? Let me know about it! In the meantime, have fun sending lots of text to ChatGPT.
Text preprocessing can help shorten and refine your input, ensuring that ChatGPT can grasp the essence without getting overwhelmed. In this article, we’ll explore these techniques, understand their importance, and see how they make your interactions with tools like ChatGPT more reliable and productive.
Text preprocessing prepares raw text data for analysis by NLP models. Generally, it distills everyday text (like full sentences) to make it more manageable or concise and meaningful. Techniques include:
While all these techniques can help reduce the size of raw text data, some of these techniques are easier to apply to general use cases than others. Let’s examine how text preprocessing can help us send a large amount of text to ChatGPT.
In the realm of Natural Language Processing (NLP), a token is the basic unit of text that a system reads. At its simplest, you can think of a token as a word, but depending on the language and the specific tokenization method used, a token can represent a word, part of a word, or even multiple words.
While in English we often equate tokens with words, in NLP, the concept is broader. A token can be as short as a single character or as long as a word. For example, with word tokenization, the sentence “Unicode characters such as emojis are not indivisible. ✂️” can be broken down into tokens like this: [“Unicode”, “characters”, “such”, “as”, “emojis”, “are”, “not”, “indivisible”, “.”, “✂️”]
In another form called Byte-Pair Encoding (BPE), the same sentence is tokenized as: ["Un", "ic", "ode", " characters", " such", " as", " em", "oj", "is", " are", " not", " ind", "iv", "isible", ".", " �", "�️"]. The emoji itself is split into tokens containing its underlying bytes.
Depending on the ChatGPT model chosen, your text input size is restricted by tokens. Here are the docs containing current limits. BPE is used by ChatGPT to determine token count, and we’ll discuss it more thoroughly later. First, we can programmatically apply some preprocessing techniques to reduce our text input size and use fewer tokens.
For a general approach that can be applied programmatically, pruning is a suitable preprocessing technique. One form is stop word removal, or removing common words that might not add significant meaning in certain contexts. For example, consider the sentence:
“I always enjoy having pizza with my friends on weekends.”
Stop words are often words that don’t carry significant meaning on their own in a given context. In this sentence, words like “I”, “always”, “enjoy”, “having”, “with”, “my”, “on” are considered stop words.
After removing the stop words, the sentence becomes:
“pizza friends weekends.”
Now, the sentence is distilled to its key components, highlighting the main subject (pizza) and the associated context (friends and weekends). If you find yourself wishing you could convince people to do this in real life (coughmeetingscough)… you aren’t alone.
Stop word removal is straightforward to apply programmatically: given a list of stop words, examine some text input to see if it contains any of the stop words on your list. If it does, remove them, then return the altered text.
def clean_stopwords(text: str) -> str:
    stopwords = ["a", "an", "and", "at", "but", "how", "in", "is", "on", "or", "the", "to", "what", "will"]
    tokens = text.split()
    clean_tokens = [t for t in tokens if t not in stopwords]
    return " ".join(clean_tokens)
To see how effective stop word removal can be, I took the entire text of my Tech Leader Docs newsletter (17,230 words consisting of 104,892 characters) and processed it using the above function. How effective was it? The resulting text contained 89,337 characters, which is about a 15% reduction in size.
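As a quick sanity check on that figure, using the character counts reported above:

```python
original_chars = 104_892  # character count before stop word removal
cleaned_chars = 89_337    # character count after stop word removal

reduction = 1 - cleaned_chars / original_chars
print(f"{reduction:.0%}")  # 15%
```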
Other pruning techniques can also be applied programmatically. Removing punctuation, numbers, HTML tags, URLs and email addresses, or non-alphabetical characters are all valid pruning techniques that can be straightforward to apply. Here is a function that does just that:
import re

def clean_text(text):
    # Remove URLs
    text = re.sub(r'http\S+', '', text)
    # Remove email addresses
    text = re.sub(r'\S+@\S+', '', text)
    # Remove everything that's not a letter (a-z, A-Z)
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    # Collapse runs of whitespace, tabs, and new lines into single spaces
    text = ' '.join(text.split())
    return text
What measure of length reduction might we be able to get from this additional processing? Applying these techniques to the remaining characters of Tech Leader Docs results in just 75,217 characters; an overall reduction of about 28% from the original text.
More opinionated pruning, such as removing short words or specific words or phrases, can be tailored to a specific use case. These don’t lend themselves well to general functions, however.
Now that you have some text processing techniques in your toolkit, let’s look at how a reduction in characters translates to fewer tokens used when it comes to ChatGPT. To understand this, we’ll examine Byte-Pair Encoding.
Byte-Pair Encoding (BPE) is a subword tokenization method. It was originally introduced for data compression but has since been adapted for tokenization in NLP tasks. It allows representing common words as tokens and splits more rare words into subword units. This enables a balance between character-level and word-level tokenization.
Let’s make that more concrete. Imagine you have a big box of LEGO bricks, and each brick represents a single letter or character. You’re tasked with building words using these LEGO bricks. At first, you might start by connecting individual bricks to form words. But over time, you notice that certain combinations of bricks (or characters) keep appearing together frequently, like “th” in “the” or “ing” in “running.”
BPE is like a smart LEGO-building buddy who suggests, “Hey, since ’th’ and ‘ing’ keep appearing together a lot, why don’t we glue them together and treat them as a single piece?” This way, the next time you want to build a word with “the” or “running,” you can use these glued-together pieces, making the process faster and more efficient.
Colloquially, the BPE algorithm looks like this:
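A toy version of the merge loop can be sketched in a few lines of Python. This is a character-level sketch only, ignoring real-world details like byte-level fallback and learned vocabularies:

```python
from collections import Counter

def merge_pair(tokens, pair):
    """Replace every adjacent occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i < len(tokens) - 1 and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def bpe(text, num_merges):
    """Toy BPE: start from characters, repeatedly merge the most frequent pair."""
    tokens = list(text)
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        best = pairs.most_common(1)[0][0]
        tokens = merge_pair(tokens, best)
    return tokens

print(bpe("the thin thug", 1))  # "th" has been glued into a single token
```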
BPE is a particularly powerful tokenization method, especially when dealing with diverse and extensive vocabularies. Here’s why:
In essence, BPE strikes a balance, offering the granularity of character-level tokenization and the context-awareness of word-level tokenization. This hybrid approach ensures that NLP models like ChatGPT can understand a wide range of texts while maintaining computational efficiency.
At the time of writing, a message to ChatGPT via its web interface has a maximum length of 4,096 tokens. If we take the roughly 28% reduction demonstrated above as an average, this means you could reduce text of up to 5,712 tokens down to the appropriate size with just text preprocessing.
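The arithmetic behind that estimate, assuming an average reduction of about 28%, looks like this:

```python
token_limit = 4096     # web interface maximum at time of writing
avg_reduction = 0.283  # approximate average reduction from the examples above

# Largest original size that preprocessing could shrink to fit the limit
max_input = token_limit / (1 - avg_reduction)
print(int(max_input))  # 5712
```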
What about when this isn’t enough? Beyond text preprocessing, larger input can be sent in chunks using the OpenAI API. In my next post, I’ll show you how to build a Python module that does exactly that.
In a frenzy of accomplishment you drag it into the house – only to discover that your dining room doorway is several inches too small. It doesn’t fit.
Art only imitates life, so you may say that this comedic example is unrealistic. Of course an experienced DIY-er would have measured the doorway first. In real life, however, unforeseen hindrances rarely come in ones: once you get the table in the door, you discover the floor’s uneven. Perhaps the chairs you’ve chosen are a few inches too short… and so on.
Far from attempting to persuade you away from your next DIY project, I’d like to help make those and any other projects you take on go even smoother. The same patterns are found in furniture-building as in software development: it’s always better to build in context.
Few software developers are accurate when it comes to time and cost estimates. This isn’t a failing of software engineers, but a human tendency toward optimism when it comes to predicting your own future. First proposed by Daniel Kahneman and Amos Tversky in 1979, the planning fallacy is no new concept.
In one study, students were asked to estimate how long they would take to finish their senior theses. The estimates, an average 27.4 days at the optimistic end and 48.6 days at the pessimistic end, came up predictably short. The average actual completion time was 55.5 days.
The study proposed two main hypotheses as to why this happens: first, that people tend to focus on their future plans rather than their past experiences; and second, people don’t tend to think that past experiences matter all that much to the future anyway.
You can probably find examples of this in your own life without trying too hard, perhaps couched in the infamous “only because” envelope. Sure, that last “weekend project” turned into a two-week affair, but that was only because you had to go run some unexpected errands. Or maybe you didn’t finish that blog post when you meant to, but that’s only because your siblings dropped in to visit. You’re absolutely, positively, definitely certain that next time will be different.
In reality, people are just plain poor at factoring in the unexpected daily demands of life. This makes sense from a certain perspective: if we were good at it, we’d probably have a lot more to fret about on a daily basis. Some measure of ignorance can make life a little more blissful.
That said, some measure of accurate planning is also necessary for success. One way we can improve accuracy is to work in context as much as possible.
Let’s consider the dining room table story again. Instead of spending months out in the garage, what would you do differently to build in context?
You might say, “Build it in the dining room!” While that would certainly be ideal for context, both in homes and in software development, it’s rarely possible (or palatable). Instead, you can do the next best thing: start building, and make frequent visits to context.
Having decided you’d like to build a table, one of the first questions is likely, “How big will it be?” You’ll undoubtedly have some requirements to fulfill (must seat six, must match other furniture, must hold the weight of your annual twenty-eight-course Christmas feast, etc.) that will lead you to a rough decision.
With a size in mind, you can then build a mock up. At this point, the specific materials, style, and color don’t matter – only its three dimensions. Once you have your mock table, you now have the ability to make your first trip to the context in which you hope it will ultimately live. Attempting to carry your foam/wood/cardboard/balloon animal mock up into the dining room is highly likely to reveal a number of issues, and possibly new opportunities as well. Perhaps, though you’d never have thought it, a modern abstractly-shaped dining table would better complement the space and requirements. (It worked for the Jetsons.) You can then take this into account in your next higher-fidelity iteration.
This process translates directly to software development, minus the Christmas feast. You may have already recognized the MVP approach; however, even here, putting the MVP in context is a step that’s frequently omitted.
Where will your product ultimately live? How will it be accessed? Building your MVP and attempting to deploy it is sure to help uncover lots of little hiccups at an early stage.
Even when teams have prior experience with stacks or technologies, remember the planning fallacy. People have a natural tendency to discount past evidence to the point of forgetting (memory bias). It’s also highly unlikely that the same exact team is building the same exact product as the last time. The language, technology, framework, and infrastructure have likely changed in at least some small way – as have the capabilities and appetites of the engineers on your team. Frequent visits to context can help you run into any issues early on, adapt to them, and create a short feedback loop.
The specific meaning of putting something in context is going to vary from one software project to another. It may mean deployment to cloud infrastructure, running a new bare metal server, or attempting to find out if your office across the ocean can access the same resources you use. In all cases, keep those short iterations going. Don’t wait and attempt to get a version to 100% before you find out if it works in context. Send it at 80%, see how close you got, then iterate again.
The concept of building in context can be applied at any stage – of course, the sooner the better! Try applying this idea to your project guidance today. I’d love to hear how it goes.
To make the Internet possible, two things that needed imagining are layers and protocols. Layers are conceptual divides that group similar functions together. The word “protocol” means, more or less, “the way we’ve agreed to do things around here.” In short, both layers and protocols can be explained to a five-year-old as “ideas that people agreed sounded good, and then they wrote them down so that other people could do things with the same ideas.”
The Internet Protocol Suite is described in terms of layers and protocols. Collectively, the suite refers to the communication protocols that enable our endless scrolling. It’s often called by its foundational protocols: the Transmission Control Protocol (TCP) and the Internet Protocol (IP). Lumped together as TCP/IP, these protocols describe how data on the Internet is packaged, addressed, sent, and received.
Here’s why the Internet Protocol Suite, or TCP/IP, is an imaginary rainbow layer cake.
If you consider the general nature of a rainbow layer sponge cake, it’s mostly made up of soft, melt-in-your-mouth vanilla-y goodness. This goodness is itself composed of something along the lines of eggs, butter, flour, and sweetener.
There isn’t much to distinguish one layer of a rainbow sponge cake from another. Often, the only difference between layers is the food-coloring and a bit of frosting. When you think about it, it’s all cake from top to bottom. The rainbow layers are only there because the baker thought they ought to be.
Similar to cake ingredients, layers in the context of computer networking are mostly composed of protocols, algorithms, and configurations, with some data sprinkled in. It can be easier to talk about computer networking if its many functions are split up into groups, so certain people came up with descriptions of layers, which we call network models. TCP/IP is just one network model among others. In this sense, layers are concepts, not things.
Some of the people in question are part of the Internet Engineering Task Force (IETF). They created the RFC-1122 publication, discussing the Internet’s communications layers. Half of a whole, the standard:
…covers the communications protocol layers: link layer, IP layer, and transport layer; its companion RFC-1123 covers the application and support protocols.
The layers described by RFC-1122 and RFC-1123 each encapsulate protocols that satisfy the layer’s functionality. Let’s look at each of these communications layers and see how TCP and IP stack up in this model of the Internet layer cake.
The link layer is the most basic, or lowest-level, classification of communication protocol. It deals with sending information between hosts on the same local network, and translating data from the higher layers to the physical layer. Protocols in the link layer describe how data interacts with the transmission medium, such as electronic signals sent over specific hardware. Unlike other layers, link layer protocols are dependent on the hardware being used.
Protocols in the Internet layer describe how data is sent and received over the Internet. The process involves packaging data into packets, addressing and transmitting packets, and receiving incoming packets of data.
The most widely known protocol in this layer gives TCP/IP its last two letters. IP is a connectionless protocol, meaning that it provides no guarantee that packets are sent or received in the right order, along the same path, or even in their entirety. Reliability is handled by other protocols in the suite, such as in the transport layer.
There are currently two versions of IP in use: IPv4 and IPv6. Both versions describe how devices on the Internet are assigned IP addresses, which are used when navigating to cat memes. IPv4 is more widely used, but has only 32 bits for addressing, allowing for about 4.3 billion (ca. 4.3×10⁹) possible addresses. These are running out, and IPv4 will eventually suffer from address exhaustion as more and more people use more devices on the Internet.
The successor version IPv6 aims to solve address exhaustion by using 128 bits for addresses. This provides, um, a lot more address possibilities (ca. 3.4×10³⁸).
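Those address-space figures are easy to check, since the exact counts are just 2^32 and 2^128:

```python
print(f"IPv4: {2**32:,} addresses")     # IPv4: 4,294,967,296 addresses
print(f"IPv6: {2**128:.1e} addresses")  # IPv6: 3.4e+38 addresses
```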
In May 1974, Vint Cerf and Bob Kahn (collectively often called “the fathers of the Internet”) published a paper entitled A Protocol for Packet Network Intercommunication. This paper contained the first description of a Transmission Control Program, a concept encompassing what would eventually be known as the Transmission Control Protocol (TCP) and User Datagram Protocol (UDP). (I had the pleasure of meeting Vint and can personally confirm that yes, he does look exactly like The Architect in the Matrix movies.)
The transport layer presently encapsulates TCP and UDP. Like IP, UDP is connectionless and can be used to prioritize time over reliability. TCP, on the other hand, is a connection-oriented transport layer protocol that prioritizes reliability over latency, or time. TCP describes transferring data in the same order as it was sent, retransmitting lost packets, and controls affecting the rate of data transmission.
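You can watch the transport layer at work with Python’s standard socket module. This minimal sketch echoes a message over a TCP connection on localhost (the OS picks a free port):

```python
import socket
import threading

# A tiny TCP echo server on a random localhost port
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]

def echo_once():
    conn, _ = server.accept()
    conn.sendall(conn.recv(1024))  # TCP delivers the bytes reliably and in order
    conn.close()

threading.Thread(target=echo_once, daemon=True).start()

# A TCP client: connect, send, and receive the echoed reply
client = socket.create_connection(("127.0.0.1", port))
client.sendall(b"cat memes")
reply = client.recv(1024)
client.close()
print(reply)  # b'cat memes'
```

The retransmission and ordering guarantees TCP describes all happen beneath this API, which is exactly why the layered model is useful.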
The application layer describes the protocols that software applications interact with most often. The specification includes descriptions of the remote login protocol Telnet, the File Transfer Protocol (FTP), and the Simple Mail Transfer Protocol (SMTP).
Also included in the application layer are the Hypertext Transfer Protocol (HTTP) and its successor, Hypertext Transfer Protocol Secure (HTTPS). HTTPS is secured by Transport Layer Security, or TLS, which can be said to be the top-most layer of the networking model described by the Internet protocol suite. If you’d like to further understand TLS and how this protocol secures your cat meme viewing, I invite you to read my article about TLS and cryptography.
Like a still-rising sponge cake, descriptions of layers, better protocols, and new models are being developed every day. The Internet, or whatever it will become in the future, is still in the process of being imagined.
If you enjoyed learning from this post, there’s a lot more where this came from! I write about computing, cybersecurity, and building great technical teams. Subscribe to see new articles first.
About 60 seconds to billions of years, as it turns out.
All Wi-Fi encryption is not created equal. Let’s explore what makes these four acronyms so different, and how you can best protect your home and organization Wi-Fi.
In the beginning, there was WEP.
Wired Equivalent Privacy is a deprecated security algorithm from 1997 that was intended to provide equivalent security to a wired connection. “Deprecated” means, “Let’s not do that anymore.”
Even when it was first introduced, it was known not to be as strong as it could have been, for two reasons: one, its underlying encryption mechanism; and two, World War II.
During World War II, the impact of code breaking (or cryptanalysis) was huge. Governments reacted by attempting to keep their best secret-sauce recipes at home. Around the time of WEP, U.S. Government restrictions on the export of cryptographic technology caused access point manufacturers to limit their devices to 64-bit encryption. Though this was later lifted to 128-bit, even this form of encryption offered a very limited possible key size.
This proved problematic for WEP. The small key size made it easier to brute-force, especially when the key doesn’t often change.
WEP’s underlying encryption mechanism is the RC4 stream cipher. This cipher gained popularity due to its speed and simplicity, but that came at a cost. It’s not the most robust algorithm. WEP employs a single shared key among its users that must be manually entered on an access point device. (When’s the last time you changed your Wi-Fi password? Right.) WEP didn’t help matters either by simply concatenating the key with the initialization vector – which is to say, it sort of mashed its secret-sauce bits together and hoped for the best.
Initialization Vector (IV): fixed-size input to a low-level cryptographic algorithm, usually random.
Combined with the use of RC4, this left WEP particularly susceptible to related-key attack. In the case of 128-bit WEP, your Wi-Fi password can be cracked by publicly-available tools in a matter of around 60 seconds to three minutes.
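WEP’s “key mashing” can be illustrated abstractly. The values here are hypothetical; the sketch shows only the concatenation that seeds RC4, not RC4 itself:

```python
import os

shared_key = b"wep-secret-13"  # long-term key, manually configured (hypothetical)
iv = os.urandom(3)             # WEP's IV is only 24 bits, and is sent in the clear
rc4_seed = iv + shared_key     # WEP simply concatenates IV and key

# The IV space is tiny: seeds must repeat after at most 2**24 packets,
# and every seed shares the same key suffix -- the related-key weakness.
print(2**24)  # 16777216
```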
While some devices came to offer 152-bit or 256-bit WEP variants, this failed to solve the fundamental problems of WEP’s underlying encryption mechanism.
So, yeah. Let’s not do that anymore.
A new, interim standard sought to temporarily “patch” the problem of WEP’s (lack of) security. The name Wi-Fi Protected Access (WPA) certainly sounds more secure, so that’s a good start; however, WPA first started out with another, more descriptive name.
Ratified in a 2004 IEEE standard, Temporal Key Integrity Protocol (TKIP) uses a dynamically-generated, per-packet key. Each packet sent has a unique temporal 128-bit key (see? Descriptive!) that solves the susceptibility to related-key attacks brought on by WEP’s shared key mashing.
TKIP also implements other measures, such as a message authentication code (MAC). Sometimes known as a checksum, a MAC provides a cryptographic way to verify that messages haven’t been changed. In TKIP, an invalid MAC can also trigger rekeying of the session key. If the access point receives an invalid MAC twice within a minute, the attempted intrusion can be countered by changing the key an attacker is trying to crack.
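TKIP’s own MAC (called Michael) is Wi-Fi-specific, but the idea is the same as any keyed MAC. Here is a generic illustration using Python’s stdlib HMAC, with hypothetical key and payload values:

```python
import hashlib
import hmac

key = b"session-key"
packet = b"packet payload"

# Sender computes a MAC over the message with the shared key
mac = hmac.new(key, packet, hashlib.sha256).digest()

# Receiver recomputes it; a match verifies the message wasn't changed
expected = hmac.new(key, packet, hashlib.sha256).digest()
print(hmac.compare_digest(mac, expected))  # True

# Any tampering yields a different MAC
tampered_mac = hmac.new(key, b"packet payl0ad", hashlib.sha256).digest()
print(hmac.compare_digest(mac, tampered_mac))  # False
```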
Unfortunately, in order to preserve compatibility with the existing hardware that WPA was meant to “patch,” TKIP retained the use of the same underlying encryption mechanism as WEP – the RC4 stream cipher. While it certainly improved on the weaknesses of WEP, TKIP eventually proved vulnerable to new attacks that extended previous attacks on WEP. These attacks take a little longer to execute by comparison: for example, twelve minutes in the case of one, and 52 hours in another. This is more than sufficient, however, to deem TKIP no longer secure.
WPA, or TKIP, has since been deprecated as well. So let’s also not do that anymore.
Which brings us to…
Rather than spend the effort to come up with an entirely new name, the improved Wi-Fi Protected Access II (WPA2) standard instead focuses on using a new underlying cipher. Instead of the RC4 stream cipher, WPA2 employs a block cipher called Advanced Encryption Standard (AES) to form the basis of its encryption protocol. The protocol itself, abbreviated CCMP, draws most of its security from the length of its rather long name (I’m kidding): Counter Mode Cipher Block Chaining Message Authentication Code Protocol, which shortens to Counter Mode CBC-MAC Protocol, or CCM mode Protocol, or CCMP. 🤷
CCM mode is essentially a combination of a few good ideas. It provides data confidentiality through CTR mode, or counter mode. To vastly oversimplify, this adds complexity to plaintext data by encrypting the successive values of a count sequence that does not repeat. CCM also integrates CBC-MAC, a block cipher method for constructing a MAC.
AES itself is on good footing. The AES specification was established in 2001 by the U.S. National Institute of Standards and Technology (NIST) after a five-year competitive selection process during which fifteen proposals for algorithm designs were evaluated. As a result of this process, a family of ciphers called Rijndael (Dutch) was selected, and a subset of these became AES. For the better part of two decades, AES has been used to protect every-day Internet traffic as well as certain levels of classified information in the U.S. Government.
While possible attacks on AES have been described, none have yet been proven to be practical in real-world use. The fastest attack on AES in public knowledge is a key-recovery attack that improved on brute-forcing AES by a factor of about four. How long would it take? Some billions of years.
The next installment of the WPA trilogy has been required for new devices since July 1, 2020. Expected to further enhance the security of WPA2, the WPA3 standard seeks to improve password security by being more resilient to word list or dictionary attacks.
Unlike its predecessors, WPA3 will also offer forward secrecy. This adds the considerable benefit of protecting previously exchanged information even if a long-term secret key is compromised. Forward secrecy is already provided by protocols like TLS by using asymmetric keys to establish shared keys. You can learn more about TLS in this post.
As WPA2 has not been deprecated, both WPA2 and WPA3 remain your top choices for Wi-Fi security.
You may be wondering why your access point even allows you to choose an option other than WPA2 or WPA3. The likely reason is that you’re using legacy hardware, which is what tech people call your mom’s router.
Since the deprecation of WEP and WPA occurred (in old-people terms) rather recently, it’s possible in large organizations as well as your parents’ house to find older hardware that still uses these protocols. Even newer hardware may have a business need to support these older protocols.
While I may be able to convince you to invest in a shiny new top-of-the-line Wi-Fi appliance, most organizations are a different story. Unfortunately, many just aren’t yet cognizant of the important role cybersecurity plays in meeting customer needs and boosting that bottom line. Additionally, switching to newer protocols may require new internal hardware or firmware upgrades. Especially on complex systems in large organizations, upgrading devices can be financially or strategically difficult.
If it’s an option, choose WPA2 or WPA3. Cybersecurity is a field that evolves by the day, and getting stuck in the past can have dire consequences.
If you can’t use WPA2 or WPA3, do the best you can to take additional security measures. The best bang for your buck is to use a Virtual Private Network (VPN). Using a VPN is a good idea no matter which type of Wi-Fi encryption you have. On open Wi-Fi (coffee shops) and using WEP, it’s plain irresponsible to go without a VPN. Kind of like shouting out your bank details as you order your second cappuccino.
When possible, ensure you only connect to known networks that you or your organization control. Many cybersecurity attacks are executed when victims connect to an imitation public Wi-Fi access point, also called an evil twin attack, or Wi-Fi phishing. These fake hotspots are easily created using publicly accessible programs and tools. A VPN can help mitigate damage from these attacks as well, but it’s always better not to take the risk. If you travel often, consider purchasing a portable hotspot that uses a cellular data plan, or using data SIM cards for all your devices.
WEP, WPA, WPA2, and WPA3 mean a lot more than a bunch of similar letters – in some cases, it’s a difference of billions of years minus about 60 seconds.
On more of a now-ish timescale, I hope I’ve taught you something new about the security of your Wi-Fi and how you can improve it!
Know someone who’d benefit from some beefed up cybersecurity? Share the cybersecurity starter pack!
TLS, or Transport Layer Security, refers to a protocol. “Protocol” is a word that means, “the way we’ve agreed to do things around here,” more or less. The “transport layer” part of TLS simply refers to host-to-host communication, such as how a client and a server interact, in the Internet protocol suite model.
The TLS protocol attempts to solve these fundamental problems:

- How do I know you are who you say you are?
- How do I know this message from you hasn’t been tampered with?
- How can we communicate securely?
Here’s how TLS works, explained in plain English. As with many successful interactions, it begins with a handshake.
The basic process of a TLS handshake involves a client, such as your web browser, and a server, such as one hosting a website, establishing some ground rules for communication. It begins with the client saying hello. Literally. It’s called a ClientHello message.
The ClientHello message tells the server which TLS protocol version and cipher suites it supports. While “cipher suite” sounds like a fancy hotel upgrade, it just refers to a set of algorithms that can be used to secure communications. The server, in a similarly named ServerHello message, chooses the protocol version and cipher suite to use from the choices offered. Other data may also be sent, for example, a session ID if the server supports resuming a previous handshake.
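You can peek at the cipher suites your own client could offer using Python’s standard `ssl` module. (The exact names and count depend on your Python build and its underlying OpenSSL, so treat the output as illustrative.)

```python
import ssl

# A default client-side context, like the one a Python HTTPS client would use
ctx = ssl.create_default_context()

# Each entry is one cipher suite: a named bundle of key-exchange,
# encryption, and message-authentication algorithms
for suite in ctx.get_ciphers()[:5]:
    print(suite["name"], "-", suite["protocol"])
```

During a real handshake, the ClientHello carries a list much like this one, and the server picks a single suite from it in its ServerHello.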
Depending on the cipher suite chosen, the client and server exchange further information in order to establish a shared secret. Often, this process moves the exchange from asymmetric cryptography to symmetric cryptography with varying levels of complexity. Let’s explore these concepts at a general level and see why they matter to TLS.
This is asymmetry:
Asymmetric cryptography is one method by which you can perform authentication. When you authenticate yourself, you answer the fundamental question, “How do I know you are who you say you are?”
In an asymmetric cryptographic system, you use a pair of keys in order to achieve authentication. These keys are asymmetric. One key is your public key, which, as you would guess, is public. The other is your private key, which – well, you know.
Typically, during the TLS handshake, the server will provide its public key via its digital certificate, sometimes still called its SSL certificate, though TLS replaces the deprecated Secure Sockets Layer (SSL) protocol. Digital certificates are provided and verified by trusted third parties known as Certificate Authorities (CA), which are a whole other article in themselves.
While anyone may encrypt a message using your public key, only your private key can then decrypt that message. The security of asymmetric cryptography relies only on your private key staying private, hence the asymmetry. It’s also asymmetric in the sense that it’s a one-way trip. Alice can send messages encrypted with your public key to you, but neither of your keys will help you send an encrypted message to Alice.
Asymmetric cryptography also requires more computational resources than symmetric cryptography. Thus when a TLS handshake begins with an asymmetric exchange, the client and server will use this initial communication to establish a shared secret, sometimes called a session key. This key is symmetric, meaning that both parties use the same shared secret and must maintain that secrecy for the encryption to be secure.
By using the initial asymmetric communication to establish a session key, the client and server can rely on the session key being known only to them. For the rest of the session, they’ll both use this same shared key to encrypt and decrypt messages, which speeds up communication.
A TLS handshake may use asymmetric cryptography or other cipher suites to establish the shared session key. Once the session key is established, the handshaking portion is complete and the session begins.
The session is the duration of encrypted communication between the client and server. During this time, messages are encrypted and decrypted using the session key that only the client and server have. This ensures that communication is secure.
The integrity of exchanged information is maintained by using a checksum. Messages exchanged using session keys have a message authentication code (MAC) attached. This is not the same thing as your device’s MAC address. The MAC is generated and verified using the session key. Because of this, either party can detect if a message has been changed before being received. This solves the fundamental question, “How do I know this message from you hasn’t been tampered with?”
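Here’s a minimal sketch of the integrity idea using Python’s standard `hmac` module. (TLS’s actual MAC construction depends on the negotiated cipher suite; this only illustrates the concept, and the key and messages are made up.)

```python
import hashlib
import hmac

session_key = b"shared-secret-known-only-to-client-and-server"
message = b"transfer $100 to Alice"

# Sender attaches a MAC computed from the message and the session key
mac = hmac.new(session_key, message, hashlib.sha256).hexdigest()

# Receiver recomputes the MAC from the received message and compares
expected = hmac.new(session_key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(mac, expected))   # True: message is intact

# A tampered message produces a different MAC, so the change is detected
tampered = b"transfer $100 to Mallory"
forged = hmac.new(session_key, tampered, hashlib.sha256).hexdigest()
print(hmac.compare_digest(mac, forged))     # False: tampering detected
```

Because only the client and server hold the session key, an attacker who alters a message in transit can’t produce a matching MAC.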
Sessions can end deliberately, due to network disconnection, or from the client staying idle for too long. Once a session ends, it must be re-established via a new handshake or through previously established secrets called session IDs that allow resuming a session.
Let’s recap:

- Authentication: during the handshake, digital certificates and asymmetric keys answer, “How do I know you are who you say you are?”
- Integrity: MACs generated and verified with the session key answer, “How do I know this message from you hasn’t been tampered with?”
- Encryption: the shared session key, known only to client and server, answers, “How can we communicate securely?”
This is just a surface-level skim of the very complex cryptographic systems that help to keep your communications secure. For more depth on the topic, I recommend exploring cipher suites and the various supported algorithms.
The TLS protocol serves a very important purpose in your everyday life. It helps to secure your emails to family, your online banking activities, and the connection by which you’re reading this article. The HTTPS communication protocol is encrypted using TLS. Every time you see that little lock icon in your URL bar, you’re experiencing firsthand all the concepts you’ve just read about in this article. Now you know the answer to the last question: “How can we communicate securely?”
SQLite (“see-quell-lite”) is a lightweight Structured Query Language (SQL) database engine. Instead of using the client-server database management system model, SQLite is self-contained in a single file. It is library, database, and data, all in one package.
For certain applications, SQLite is a solid choice for a production database. It’s lightweight, ultra-portable, and has no external dependencies. Remember when MacBook Air first came out? It’s nothing like that.
SQLite is best suited for production use in applications that:

- Are read-heavy, with relatively few concurrent writes
- Can keep their data in a single file on the same machine as the application
- Benefit from having no separate database server to install, configure, or maintain
If your application can benefit from SQLite’s serverless convenience, you may like to know about the different modes available for managing database changes.
The POSIX system call `fsync()` commits buffered data (data saved in the operating system cache) referred to by a specified file descriptor to permanent storage or disk. This is relevant to understanding the difference between SQLite’s two modes, as `fsync()` will block until the device reports the transfer is complete.
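In Python, the same pattern looks like this (a sketch; `os.fsync` maps directly to the POSIX call, and the file name is made up):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "journal.tmp")

# Write data, then force it from the OS cache to persistent storage
with open(path, "wb") as f:
    f.write(b"original page contents")
    f.flush()              # push Python's own buffer down to the OS cache
    os.fsync(f.fileno())   # block until the device reports the transfer complete

with open(path, "rb") as f:
    print(f.read())        # b'original page contents'
```

The `flush()` alone only hands the data to the operating system; it’s the `fsync()` that guarantees the bytes survive a crash or power loss.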
For efficiency, SQLite uses atomic commits to batch database changes into a single transaction. This enables the apparent writing of many transactions to a database file simultaneously. Atomic commits are performed using one of two modes: a rollback journal, or a write-ahead log (WAL).
A rollback journal is essentially a back-up file created by SQLite before write changes occur on a database file. It has the advantage of providing high reliability by helping SQLite restore the database to its original state in case a write operation is compromised during the disk-writing process.
Assuming a cold cache, SQLite first needs to read the relevant pages from a database file before it can write to it. Information is read out into the operating system cache, then transferred into user space. SQLite obtains a reserved lock on the database file, preventing other processes from writing to the database. At this point, other processes may still read from the database.
SQLite creates a separate file, the rollback journal, with the original content of the pages that will be changed. Initially existing in the cache, the rollback journal is written to persistent disk storage with `fsync()` to enable SQLite to restore the database should its next operations be compromised.

SQLite then obtains an exclusive lock preventing other processes from reading or writing, and writes the page changes to the database file in cache. Since writing to disk is slower than interaction with the cache, writing to disk doesn’t occur immediately. The rollback journal continues to exist until changes are safely written to disk, with a second `fsync()`. From a user-space process point of view, the change to the disk (the COMMIT, or end of the transaction) happens instantaneously once the rollback journal is deleted - hence, atomic commits. However, the two `fsync()` operations required to complete the COMMIT make this option, from a transactional standpoint, slower than SQLite’s lesser-known WAL mode.
While the rollback journal method uses a separate file to preserve the original database state, the WAL method uses a separate WAL file to instead record the changes. Instead of a COMMIT depending on writing changes to disk, a COMMIT in WAL mode occurs when a record of one or more commits is appended to the WAL. This has the advantage of not requiring blocking read or write operations to the database file in order to make a COMMIT, so more transactions can happen concurrently.
WAL mode introduces the concept of the checkpoint, which is when the WAL file is synced to persistent storage before all its transactions are transferred to the database file. You can optionally specify when this occurs, but SQLite provides reasonable defaults. The checkpoint is the WAL version of the atomic commit.
In WAL mode, write transactions are performed faster than in the traditional rollback journal mode. Each transaction involves writing the changes only once to the WAL file instead of twice - to the rollback journal, and then to disk - before the COMMIT signals that the transaction is over.
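With Python’s built-in `sqlite3` module, switching a database to WAL mode is a one-line pragma. (WAL requires a file-backed database, not `:memory:`; the path and table here are made up for illustration.)

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db_path)

# Switch from the default rollback journal to write-ahead logging.
# The setting is persistent: future connections to this file get WAL too.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

conn.execute("CREATE TABLE kv (k TEXT, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('journal', ?)", (mode,))
conn.commit()
conn.close()
```

After this, you’ll see `app.db-wal` and `app.db-shm` files alongside the database while connections are active; they hold the appended changes and the shared index into them.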
For medium-sized read-heavy applications, SQLite may be a great choice. Using SQLite in WAL mode may make it an even better one. Benchmarks on the smallest EC2 instance, with no provisioned IOPS, put this little trooper at 400 write transactions per second, and thousands of reads. That’s some perfectly adequate capability, in a perfectly compact package.
Multithreading in Python is a bit of a bitey subject (not sorry) in that the Python interpreter doesn’t actually let multiple threads execute at the same time. Python’s Global Interpreter Lock, or GIL, prevents multiple threads from executing Python bytecode at once. Each thread that wants to execute must first wait for the GIL to be released by the currently executing thread. The GIL is pretty much the microphone in a low-budget conference panel, except where no one gets to shout.
This has the advantage of preventing race conditions. It does, however, lack the performance advantages afforded by running multiple tasks in parallel. (If you’d like a refresher on concurrency, parallelism, and multithreading, see Concurrency, parallelism, and the many threads of Santa Claus.) While I prefer Go for its convenient first-class primitives that support concurrency (see Goroutines), this project’s recipients were more comfortable with Python. I took it as an opportunity to test and explore!
Simultaneously performing multiple tasks in Python isn’t impossible; it just takes a little extra work. For Hydra, the main advantage is in overcoming the input/output (I/O) bottleneck.
In order to get web pages to check, Hydra needs to go out to the Internet and fetch them. When compared to tasks that are performed by the CPU alone, going out over the network is comparatively slower. How slow?
Here are approximate timings for tasks performed on a typical PC:
| Type | Task | Time |
| --- | --- | --- |
| CPU | execute typical instruction | 1/1,000,000,000 sec = 1 nanosec |
| CPU | fetch from L1 cache memory | 0.5 nanosec |
| CPU | branch misprediction | 5 nanosec |
| CPU | fetch from L2 cache memory | 7 nanosec |
| RAM | Mutex lock/unlock | 25 nanosec |
| RAM | fetch from main memory | 100 nanosec |
| Network | send 2K bytes over 1Gbps network | 20,000 nanosec |
| RAM | read 1MB sequentially from memory | 250,000 nanosec |
| Disk | fetch from new disk location (seek) | 8,000,000 nanosec (8ms) |
| Disk | read 1MB sequentially from disk | 20,000,000 nanosec (20ms) |
| Network | send packet US to Europe and back | 150,000,000 nanosec (150ms) |
Peter Norvig first published these numbers some years ago in Teach Yourself Programming in Ten Years. Since computers and their components change year over year, the exact numbers shown above aren’t the point. What these numbers help to illustrate is the difference, in orders of magnitude, between operations.
Compare the difference between fetching from main memory and sending a simple packet over the Internet. While both these operations occur in less than the blink of an eye (literally) from a human perspective, you can see that sending a simple packet over the Internet is over a million times slower than fetching from RAM. It’s a difference that, in a single-thread program, can quickly accumulate to form troublesome bottlenecks.
In Hydra, the task of parsing response data and assembling results into a report is relatively fast, since it all happens on the CPU. The slowest portion of the program’s execution, by over six orders of magnitude, is network latency. Not only does Hydra need to fetch packets, but whole web pages! One way of improving Hydra’s performance is to find a way for the page fetching tasks to execute without blocking the main thread.
Python has a couple options for doing tasks in parallel: multiple processes, or multiple threads. These methods allow you to circumvent the GIL and speed up execution in a couple different ways.
To execute parallel tasks using multiple processes, you can use Python’s `ProcessPoolExecutor`. A concrete subclass of `Executor` from the `concurrent.futures` module, `ProcessPoolExecutor` uses a pool of processes spawned with the `multiprocessing` module to avoid the GIL.
This option uses worker subprocesses, which by default number as many as the processors on the machine. The `multiprocessing` module allows you to maximally parallelize function execution across processes, which can really speed up compute-bound (or CPU-bound) tasks.
Since the main bottleneck for Hydra is I/O and not the processing to be done by the CPU, I’m better served by using multiple threads.
Fittingly named, Python’s `ThreadPoolExecutor` uses a pool of threads to execute asynchronous tasks. Also a subclass of `Executor`, it uses a defined number of maximum worker threads (at least five by default, according to the formula `min(32, os.cpu_count() + 4)`) and reuses idle threads before starting new ones, making it pretty efficient.
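As a standalone sketch of why this helps with I/O-bound work, here a slow “fetch” is simulated with `time.sleep` (the URLs are made up): ten blocked threads wait in parallel instead of one after another.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(url):
    # Stand-in for network latency; a real fetch would block on I/O the same way
    time.sleep(0.1)
    return f"{url}: 200 OK"

urls = [f"https://example.com/page{i}" for i in range(10)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_fetch, urls))
elapsed = time.perf_counter() - start

# Done sequentially, ten 0.1-second waits would take about a second;
# with ten threads, they overlap into roughly 0.1 seconds.
print(f"{len(results)} pages in {elapsed:.2f}s")
```

While a thread sleeps (or waits on a socket), it releases the GIL, which is exactly why multithreading pays off for I/O-bound programs like Hydra despite the GIL.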
Here is a snippet of Hydra with comments showing how Hydra uses `ThreadPoolExecutor` to achieve parallel multithreaded bliss:
from concurrent import futures
from queue import Queue, Empty

# Create the Checker class
class Checker:
# Queue of links to be checked
TO_PROCESS = Queue()
# Maximum workers to run
THREADS = 100
# Maximum seconds to wait for HTTP response
TIMEOUT = 60
def __init__(self, url):
...
# Create the thread pool
self.pool = futures.ThreadPoolExecutor(max_workers=self.THREADS)
def run(self):
# Run until the TO_PROCESS queue is empty
while True:
try:
target_url = self.TO_PROCESS.get(block=True, timeout=2)
# If we haven't already checked this link
if target_url["url"] not in self.visited:
# Mark it as visited
self.visited.add(target_url["url"])
# Submit the link to the pool
job = self.pool.submit(self.load_url, target_url, self.TIMEOUT)
job.add_done_callback(self.handle_future)
except Empty:
return
except Exception as e:
print(e)
You can view the full code in Hydra’s GitHub repository.
If you’d like to see the full effect, I compared the run times for checking my website between a prototype single-thread program and the multiheaded, er, multithreaded Hydra.
time python3 slow-link-check.py https://victoria.dev
real 17m34.084s
user 11m40.761s
sys 0m5.436s
time python3 hydra.py https://victoria.dev
real 0m15.729s
user 0m11.071s
sys 0m2.526s
The single-thread program, which blocks on I/O, ran in about seventeen minutes. When I first ran the multithreaded version, it finished in 1m13.358s - after some profiling and tuning, it took a little under sixteen seconds. Again, the exact times don’t mean all that much; they’ll vary depending on factors such as the size of the site being crawled, your network speed, and your program’s balance between the overhead of thread management and the benefits of parallelism.
The more important thing, and the result I’ll take any day, is a program that runs some orders of magnitude faster.
I’ve been helping out a group called the Open Web Application Security Project (OWASP). They’re a non-profit foundation that produces some of the foremost application testing guides and cybersecurity resources. OWASP’s publications, checklists, and reference materials are a help to security professionals, penetration testers, and developers all over the world. Most of the individual teams that create these materials are run almost entirely by volunteers.
OWASP is a great group doing important work. I’ve seen this firsthand as part of the core team that produces the Web Security Testing Guide. However, while OWASP inspires dedication in its large volunteer base, it lacks central organization.
This lack of organization was most recently apparent in the group’s website, OWASP.org. A big organization with an even bigger website to match, OWASP.org enjoys hundreds of thousands of visitors. Unfortunately, many of its pages - individually managed by disparate projects - are infrequently updated. Some are abandoned. The website as a whole lacks a centralized quality assurance process, and as a result, OWASP.org is peppered with broken links.
Customers don’t like broken links; attackers really do. That’s because broken links are a security vulnerability. Broken links can signal opportunities for attacks like broken link hijacking and subdomain takeovers. At their least effective, these attacks can be embarrassing; at their worst, severely damaging to businesses and organizations. One OWASP group, the Application Security Verification Standard (ASVS) project, writes about integrity controls that can help to mitigate the likelihood of these attacks. This knowledge, unfortunately, has not yet propagated throughout the rest of OWASP.
This is the story of how I created a fast and efficient tool to help OWASP solve this problem.
I took on the task of creating a program that could run as part of a CI/CD process to detect and report broken links. The program needed to:

- Check every link on OWASP.org
- Report any broken links it found
- Run fast enough to be part of a CI/CD pipeline
Essentially, I needed to build a web crawler.
My original journey through this process was also in Python, as that was a comfortable language choice for everyone in the OWASP group. Personally, I prefer to use Go for higher performance as it offers more convenient concurrency primitives. Between the task and this talk, I wrote three programs: a prototype single-thread Python program, a multithreaded Python program, and a Go program using goroutines. We’ll see a comparison of how each worked out near the end of the talk - first, let’s explore how to build a web crawler.
Here’s what our web crawler will need to do:
- Get a page by its URL (starting with `https://victoria.dev`)
- Parse the page’s HTML for links
- Only crawl links on the same domain (`https://victoria.dev` and not `https://github.com`, for instance)
- Skip links it has already visited

Here’s what the execution flow will look like:
As you can see, the nodes “GET page” -> “HTML” -> “Parse links” -> “Valid link” -> “Check visited” all form a loop. These are what enable our web crawler to continue crawling until all the links on the site have been accounted for in the “Check visited” node. When the crawler encounters links it’s already checked, it will “Stop.” This loop will become more important in a moment.
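The loop described above can be sketched in Python, with a stubbed-out `fetch_links` standing in for the “GET page” and “Parse links” nodes (the tiny site map here is made up for illustration):

```python
from collections import deque

# Hypothetical site: each page maps to the links found on it
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/"],
    "/blog/post-1": ["/blog"],
}

def fetch_links(url):
    # Stand-in for "GET page" -> "HTML" -> "Parse links"
    return SITE.get(url, [])

def crawl(start):
    visited = set()
    to_process = deque([start])
    while to_process:           # keep crawling until every link is accounted for
        url = to_process.popleft()
        if url in visited:      # "Check visited" -> Stop for this link
            continue
        visited.add(url)        # mark it, then queue its outgoing links
        to_process.extend(fetch_links(url))
    return visited

print(sorted(crawl("/")))  # ['/', '/about', '/blog', '/blog/post-1']
```

The `visited` set is what breaks the cycle: even though pages link back to each other, each URL is fetched exactly once.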
For now, the question on everyone’s mind (I hope): how do we make it fast?
Here are some approximate timings for tasks performed on a typical PC:
| Type | Task | Time |
| --- | --- | --- |
| CPU | execute typical instruction | 1/1,000,000,000 sec = 1 nanosec |
| CPU | fetch from L1 cache memory | 0.5 nanosec |
| CPU | branch misprediction | 5 nanosec |
| CPU | fetch from L2 cache memory | 7 nanosec |
| RAM | Mutex lock/unlock | 25 nanosec |
| RAM | fetch from main memory | 100 nanosec |
| RAM | read 1MB sequentially from memory | 250,000 nanosec |
| Disk | fetch from new disk location (seek) | 8,000,000 nanosec (8ms) |
| Disk | read 1MB sequentially from disk | 20,000,000 nanosec (20ms) |
| Network | send packet US to Europe and back | 150,000,000 nanosec (150ms) |
Peter Norvig first published these numbers some years ago in Teach Yourself Programming in Ten Years. They typically crop up now and then in articles titled along the lines of, “Latency numbers every developer should know.”
Since computers and their components change year over year, the exact numbers shown above aren’t the point. What these numbers help to illustrate is the difference, in orders of magnitude, between operations.
Compare the difference between fetching from main memory and sending a simple packet over the Internet. While both these operations occur in less than the blink of an eye (literally) from a human perspective, you can see that sending a simple packet over the Internet is over a million times slower than fetching from RAM. It’s a difference that, in a single-thread program, can quickly accumulate to form troublesome bottlenecks.
The numbers above mean that the difference in time it takes to send something over the Internet compared to fetching data from main memory is over six orders of magnitude. Remember the loop in our execution chart? The “GET page” node, in which our crawler fetches page data over the network, is going to be a million times slower than the next slowest thing in the loop!
We don’t need to run our prototype to see what that means in practical terms; we can estimate it. Let’s take OWASP.org, which has upwards of 12,000 links, as an example:
150 milliseconds
x 12,000 links
---------
1,800,000 milliseconds (30 minutes)
A whole half hour, just for the network tasks. It may even be much slower than that, since web pages are frequently much larger than a packet. This means that in our single-thread prototype web crawler, our biggest bottleneck is network latency. Why is this problematic?
I previously wrote about feedback loops. In essence, in order to improve at doing anything, you first need to be able to get feedback from your last attempt. That way, you have the necessary information to make adjustments and get closer to your goal on your next iteration.
As a software developer, bottlenecks can contribute to long and inefficient feedback loops. If I’m waiting on a process that’s part of a CI/CD pipeline, in our bottlenecked web crawler example, I’d be sitting around for a minimum of a half hour before learning whether or not changes in my last push were successful, or whether they broke `master` (hopefully `staging`).
Multiply a slow and inefficient feedback loop by many runs per day, over many days, and you’ve got a slow and inefficient developer. Multiply that by many developers in an organization bottlenecked on the same process, and you’ve got a slow and inefficient company.
To add insult to injury, not only are you waiting on a bottlenecked process to run; you’re also paying to wait. Take the serverless example - AWS Lambda, for instance. Here’s a chart showing the cost of functions by compute time and CPU usage.
Again, the numbers change over the years, but the main concepts remain the same: the bigger the function and the longer its compute time, the bigger the cost. For applications taking advantage of serverless, these costs can add up dramatically.
Bottlenecks are a recipe for failure, for both productivity and the bottom line.
The good news is that bottlenecks are mostly unnecessary. If we know how to identify them, we can strategize our way out of them. To understand how, let’s get some tacos.
Everyone, meet Bob. He’s a gopher who works at the taco stand down the street as the cashier. Say “Hi,” Bob.
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
🌮 🌳
🌮
🌮 ╔══════════════╗
🌮 Hi I'm Bob 🌳
🌮 ╚══════════════╝ \
🌮 🐹 🌮
🌮
🌮
🌮 🌳
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
Bob works very hard at being a cashier, but he’s still just one gopher. The customers who frequent Bob’s taco stand can eat tacos really quickly; but in order to get the tacos to eat them, they’ve got to order them through Bob. Here’s what our bottlenecked, single-thread taco stand currently looks like:
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
🌮 🌳
🌮
🌮
🌮 🌳
🌮 🐹 🧑💵🧑💵🧑💵🧑💵🧑💵🧑💵🧑💵🧑💵🧑💵
🌮
🌮
🌮
🌮 🌳
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
As you can see, all the customers are queued up, right out the door. Poor Bob handles one customer’s transaction at a time, starting and finishing with that customer completely before moving on to the next. Bob can only do so much, so our taco stand is rather inefficient at the moment. How can we make Bob faster?
We can try splitting the queue:
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
🌮 🌳
🌮
🌮 🧑💵🧑💵🧑💵🧑💵
🌮 🌳
🌮 🐹
🌮
🌮 🧑💵🧑💵🧑💵🧑💵🧑💵
🌮
🌮 🌳
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
Now Bob can do some multitasking. For example, he can start a transaction with a customer in one queue; then, while that customer counts their bills, Bob can pop over to the second queue and get started there. This arrangement, known as a concurrency model, helps Bob go a little bit faster by jumping back and forth between lines. However, it’s still just one Bob, which limits our improvement possibilities. If we were to make four queues, they’d all be shorter; but Bob would be very thinly stretched between them. Can we do better?
We could get two Bobs:
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
🌮 🌳
🌮
🌮 🌳
🌮 🐹 🧑💵🧑💵🧑💵🧑💵
🌮 🌳
🌮 🐹 🧑💵🧑💵🧑💵🧑💵🧑💵
🌮 🌳
🌮
🌮 🌳
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
With twice the Bobs, each can handle a queue of his own. This is our most efficient solution for our taco stand so far, since two Bobs can handle much more than one Bob can, even if each customer is still attended to one at a time.
We can do even better than that:
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
🌮 🌳
🌮 🐹 🧑💵🧑💵
🌮 🌳
🌮 🐹 🧑💵🧑💵
🌮 🌳
🌮 🐹 🧑💵🧑💵
🌮 🌳
🌮 🐹 🧑💵🧑💵🧑💵
🌮 🌳
🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮🌮
With quadruple the Bobs, we have some very short queues, and a much more efficient taco stand. In computing, the concept of having multiple workers do tasks in parallel is called multithreading.
In Go, we can apply this concept using goroutines. Here are some illustrative snippets from my Go solution.
In order to share data between our goroutines, we’ll need to create some data structures. Our `Checker` structure will be shared, so it will have a `Mutex` (mutual exclusion) to allow our goroutines to lock and unlock it. The `Checker` structure will also hold a list of `brokenLinks` results, and `visitedLinks`. The latter will be a map of strings to booleans, which we’ll use to directly and efficiently check for visited links. By using a map instead of iterating over a list, our `visitedLinks` lookup will have a constant complexity of O(1) as opposed to a linear complexity of O(n), thus avoiding the creation of another bottleneck. For more on time complexity, see my coffee-break introduction to time complexity of algorithms article.
type Checker struct {
startDomain string
brokenLinks []Result
visitedLinks map[string]bool
workerCount, maxWorkers int
sync.Mutex
}
...
// Page allows us to retain parent and sublinks
type Page struct {
parent, loc, data string
}
// Result adds error information for the report
type Result struct {
Page
reason string
code int
}
To extract links from HTML data, here’s a parser I wrote on top of package `html`:
// Extract links from HTML
func parse(parent, data string) ([]string, []string) {
doc, err := html.Parse(strings.NewReader(data))
if err != nil {
fmt.Println("Could not parse: ", err)
}
goodLinks := make([]string, 0)
badLinks := make([]string, 0)
var f func(*html.Node)
f = func(n *html.Node) {
if n.Type == html.ElementNode && checkKey(string(n.Data)) {
for _, a := range n.Attr {
if checkAttr(string(a.Key)) {
j, err := formatURL(parent, a.Val)
if err != nil {
badLinks = append(badLinks, j)
} else {
goodLinks = append(goodLinks, j)
}
break
}
}
}
for c := n.FirstChild; c != nil; c = c.NextSibling {
f(c)
}
}
f(doc)
return goodLinks, badLinks
}
If you’re wondering why I didn’t use a more full-featured package for this project, I highly recommend the story of `left-pad`. The short of it: more dependencies, more problems.
Here are snippets of the `main` function, where we pass in our starting URL and create a queue (or channels, in Go) to be filled with links for our goroutines to process.
func main() {
...
startURL := flag.String("url", "http://example.com", "full URL of site")
...
firstPage := Page{
parent: *startURL,
loc: *startURL,
}
toProcess := make(chan Page, 1)
toProcess <- firstPage
var wg sync.WaitGroup
The last significant piece of the puzzle is to create our workers, which we’ll do here:
for i := range toProcess {
wg.Add(1)
checker.addWorker()
go worker(i, &checker, &wg, toProcess)
if checker.workerCount > checker.maxWorkers {
time.Sleep(1 * time.Second) // throttle down
}
}
wg.Wait()
A WaitGroup does just what it says on the tin: it waits for our group of goroutines to finish. When they have, we’ll know our Go web crawler has finished checking all the links on the site.
Here’s a comparison of the three programs I wrote on this journey. First, the prototype single-thread Python version:
time python3 slow-link-check.py https://victoria.dev
real 17m34.084s
user 11m40.761s
sys 0m5.436s
This finished crawling my website in about seventeen-and-a-half minutes, which is rather long for a site at least an order of magnitude smaller than OWASP.org.
The multithreaded Python version did a bit better:
time python3 hydra.py https://victoria.dev
real 1m13.358s
user 0m13.161s
sys 0m2.826s
My multithreaded Python program (which I dubbed Hydra) finished in one minute and thirteen seconds.
How did Go do?
time ./go-link-check --url=https://victoria.dev
real 0m7.926s
user 0m9.044s
sys 0m0.932s
At just under eight seconds, I found the Go version to be extremely palatable.
As fun as it is to simply enjoy the speedups, we can directly relate these results to everything we’ve learned so far. Consider taking a process that used to soak up seventeen minutes and turning it into an eight-second-affair instead. Not only will that give developers a much shorter and more efficient feedback loop, it will give companies the ability to develop faster, and thus grow more quickly - while costing less. To drive the point home: a process that runs in seventeen-and-a-half minutes when it could take eight seconds will also cost over a hundred and thirty times as much to run!
A better work day for developers, and a better bottom line for companies. There’s a lot of benefit to be had in making functions, code, and processes as efficient as possible - by breaking bottlenecks.
There are 7,713,468,100 people in the world in 2019, around 26.3% of whom are under 15 years old. This works out to 2,028,642,110 children (persons under 15 years of age) in the world this year.
Santa doesn’t seem to visit children of every religion, so we’ll generalize and only include Christians and non-religious folks. Collectively that makes up approximately 44.72% of the population. If we assume that all kids take after their parents, then 907,208,751.6 children would appear to be Santa-eligible.
What percentage of those children are good? It’s impossible to know; however, we can work on a few assumptions. One is that Santa Claus functions more on optimism than economics and would likely have prepared for the possibility that every child is a good child in any given year. Thus, he would be prepared to give a toy to every child. Let’s assume it’s been a great year and that all 907,208,751.6 children are getting toys.
That’s a lot of presents, and, as we know, they’re all made by Santa’s elves at his North Pole workshop. Given that there are 365 days in a year and one of them is Christmas, let’s assume that Santa’s elves collectively have 364 days to create and gift wrap 907,208,752 (rounded up) presents. That works out to 2,492,331.74 presents per day.
Almost two-and-a-half million presents per day is a heavy workload for any workshop. Let’s look at two paradigms that Santa might employ to hit this goal: concurrency, and parallelism.
Suppose that Santa’s workshop is staffed by exactly one, very hard working, very tired elf. The production of one present involves four steps: cutting the wood, assembling and gluing, painting, and finally gift wrapping.
With a single elf, only one step for one present can be happening at any instance in time. If the elf were to produce one present at a time from beginning to end, that process would be executed sequentially. It’s not the most efficient method for producing two-and-a-half million presents per day; for instance, the elf would have to wait around doing nothing while the glue on the present was drying before moving on to the next step.
In order to be more efficient, the elf works on all presents concurrently.
Instead of completing one present at a time, the elf first cuts all the wood for all the toys, one by one. When everything is cut, the elf assembles and glues the toys together, one after the other. This concurrent processing means that the glue from the first toy has time to dry (without needing more attention from the elf) while the remaining toys are glued together. The same goes for painting, one toy at a time, and finally wrapping.
Since one elf can only do one task at a time, a single elf is using the day as efficiently as possible by concurrently producing presents.
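The elf's stage-by-stage schedule can be sketched as a sequential simulation in Go. The stage names and present counts here are just for illustration:

```go
package main

import "fmt"

// buildConcurrently has one elf run each stage across every present
// before moving on, so glue and paint can dry in the background
// instead of blocking work on the next present.
func buildConcurrently(presents int) []string {
	stages := []string{"cut", "assemble and glue", "paint", "wrap"}
	var log []string
	for _, stage := range stages {
		for p := 1; p <= presents; p++ {
			log = append(log, fmt.Sprintf("%s present %d", stage, p))
		}
	}
	return log
}

func main() {
	for _, step := range buildConcurrently(3) {
		fmt.Println(step)
	}
}
```

Note the loop order: stage on the outside, present on the inside. Swapping the two loops would give you the inefficient one-present-at-a-time sequence.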
Hopefully, Santa’s workshop has more than just one elf. With more elves, more toys can be built simultaneously over the course of a day. This simultaneous work means that the presents are being produced in parallel. Parallel processing carried out by multiple elves means more work happens at the same time.
Elves working in parallel can also employ concurrency. One elf can still tackle only one task at a time, so it’s most efficient to have multiple elves concurrently producing presents.
Of course, if Santa’s workshop has, say, two-and-a-half million elves, each elf would only need to finish a maximum of one present per day. In this case, working sequentially doesn’t detract from the workshop’s efficiency. There would still be 7,668.26 elves left over to fetch coffee and lunch.
After all the elves’ hard work is done, it’s up to Santa Claus to deliver the presents – all 907,208,752 of them.
Santa doesn’t need to make a visit to every kid, just one visit per household tree. So how many trees does Santa need to visit? Again with broad generalization, we’ll say that the average number of children per household worldwide is 2.45, based on the year’s predicted fertility rates. That makes 370,289,286.4 houses to visit. Let’s round that up to 370,289,287.
How long does Santa have? The lore says one night, which means one earthly rotation, and thus 24 hours. NORAD confirms.
This means Santa must visit 370,289,287 households in 24 hours (86,400 seconds), at a rate of 4,285.75 households per second, nevermind the time it takes to put presents under the tree and grab a cookie.
Clearly, Santa doesn’t exist in our dimension. This is especially likely given that despite being chubby and plump, he fits down a chimney (with a lit fire, while remaining unhurt) carrying a sack of toys containing presents for all the household’s children. We haven’t even considered the fact that his sleigh carries enough toys for every believing boy and girl around the world, and flies.
Does Santa exist outside our rules of physics? How could one entity manage to travel around the world, delivering packages, in under 24 hours at a rate of 4,285.75 households per second, and still have time for milk and cookies and kissing mommy?
One thing is certain: Santa uses the Internet. No other technology has yet enabled packages to travel quite so far and quite so quickly. Even so, attempting to reach upwards of four thousand households per second is no small task, even with the best gigabit Internet hookup the North Pole has to offer. How might Santa increase his efficiency?
There’s clearly only one logical conclusion to this mystery: Santa Claus is a multithreaded process.
Let’s work outward. Think of a thread as one particular task, or the most granular sequence of instructions that Santa might execute. One thread might execute the task `put present under tree`. A thread is a component of a process, in this case, Santa’s process of delivering presents.
If Santa Claus is single-threaded, he, as a process, would only be able to accomplish one task at a time. Since he’s old and a bit forgetful, he probably has a set of instructions for delivering presents, as well as a schedule to abide by. These two things guide Santa’s thread until his process is complete.
Single-threaded Santa Claus might work something like this:
Rinse and repeat… another 370,289,286 times.
Multithreaded Santa Claus, by contrast, is the Doctor Manhattan of the North Pole. There’s still only one Santa Claus in the world; however, he has the amazing ability to multiply his consciousness and accomplish multiple instruction sets of tasks simultaneously. These additional task workers, or worker threads, are created and controlled by the main process of Santa delivering presents.
Each worker thread acts independently to complete its instructions. Since they all belong to Santa’s consciousness, they share Santa’s memory and know everything that Santa knows, including what planet they’re running around on, and where to get the presents from.
With this shared knowledge, each thread is able to execute its set of instructions in parallel with the other threads. This multithreaded parallelism makes the one and only Santa Claus as efficient as possible.
If an average present delivery takes one second, Santa need only spawn 4,286 worker threads. With each making one delivery per second, Santa will have completed all 370,289,287 trips within his 86,400-second night.
Of course, in theory, Santa could even spawn 370,289,287 worker threads, each flying to one household to deliver presents for all the children in it! That would make Santa’s process extremely efficient, and also explain how he manages to consume all those milk-dunked cookies without getting full. 🥛🍪🍪🍪
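Santa's worker threads map naturally onto Go's goroutines. Here's a minimal sketch of the worker-pool idea, where a hypothetical `deliverAll` shares one queue of households among a fixed number of workers:

```go
package main

import (
	"fmt"
	"sync"
)

// deliverAll spawns `workers` goroutines (Santa's worker threads)
// that pull households off a shared channel until none remain.
// It returns the total number of deliveries made.
func deliverAll(households, workers int) int {
	jobs := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	delivered := 0

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range jobs { // each worker shares Santa's job queue
				mu.Lock()
				delivered++ // put present under tree
				mu.Unlock()
			}
		}()
	}

	for h := 0; h < households; h++ {
		jobs <- h // the main process hands out one household at a time
	}
	close(jobs) // no more households: workers drain the queue and exit
	wg.Wait()
	return delivered
}

func main() {
	fmt.Println(deliverAll(1000, 8), "households served")
}
```

Like Santa's threads, the goroutines share the process's memory (the `delivered` counter), so access to it is guarded by a mutex.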
Thanks to modern computing, we now finally understand how Santa Claus manages the seemingly-impossible task of delivering toys to good girls and boys the world-over. From my family to yours, I hope you have a wonderful Christmas. Don’t forget to hang up your stockings on the router shelf.
Of course, none of this explains how reindeer manage to fly.
`for` loop, understanding time complexity is an integral milestone to learning how to write efficient complex programs. Think of it as having a superpower that allows you to know exactly what type of program might be the most efficient in a particular situation - before even running a single line of code.
The fundamental concepts of complexity analysis are well worth studying. You’ll be able to better understand how the code you’re writing will interact with the program’s input, and as a result, you’ll spend a lot less wasted time writing slow and problematic code. It won’t take long to go over all you need to know in order to start writing more efficient programs - in fact, we can do it in about fifteen minutes. You can go grab a coffee right now (or tea, if that’s your thing) and I’ll take you through it before your coffee break is over. Go ahead, I’ll wait.
All set? Let’s do it!
The time complexity of an algorithm is an approximation of how long that algorithm will take to process some input. It describes the efficiency of the algorithm by the magnitude of its operations. This is different than the number of times an operation repeats; I’ll expand on that later. Generally, the fewer operations the algorithm has, the faster it will be.
We write about time complexity using Big O notation, which looks something like O(n). There’s rather a lot of math involved in its formal definition, but informally we can say that Big O notation gives us our algorithm’s approximate run time in the worst case, or in other words, its upper bound.[2] It is inherently relative and comparative.[3] We’re describing the algorithm’s efficiency relative to the increasing size of its input data, n. If the input is a string, then n is the length of the string. If it’s a list of integers, n is the length of the list.
It’s easiest to picture what Big O notation represents with a graph:
Here are the most important points to remember as you read the rest of this article:

- Time complexity is an approximation of an algorithm’s worst-case run time, relative to the size of its input, n.
- Generally, the fewer operations relative to n, the faster the algorithm.
- The overall time complexity of a program is determined by its slowest part.
There are different classes of complexity that we can use to quickly understand an algorithm. I’ll illustrate some of these classes using nested loops and other examples.
A polynomial, from the Greek poly meaning “many,” and Latin nomen meaning “name,” describes an expression made up of constants and variables combined using addition, multiplication, and exponentiation to a non-negative integer power.[4] That’s a super math-y way to say that it contains variables, usually denoted by letters, in expressions shaped like n² + 2n + 1.
The below classes describe polynomial algorithms. Some have food examples.
A constant time algorithm doesn’t change its running time in response to the input data. No matter the size of the data it receives, the algorithm takes the same amount of time to run. We denote this as a time complexity of O(1).
Here’s one example of a constant algorithm that takes the first item in a slice.
func takeCupcake(cupcakes []int) int {
return cupcakes[0]
}
With this constant-time algorithm, no matter how many cupcakes are on offer, you just get the first one. Oh well. Flavours are overrated anyway.
The running time of a linear algorithm grows in direct proportion to its input: it will process the input in n number of operations. This is often the best possible (most efficient) case for time complexity where all the data must be examined.
Here’s an example of code with time complexity of O(n):
func eatChips(bowlOfChips int) {
for chip := 0; chip <= bowlOfChips; chip++ {
// dip chip
}
}
Here’s another example of code with time complexity of O(n):
func eatChips(bowlOfChips int) {
for chip := 0; chip <= bowlOfChips; chip++ {
// double dip chip
}
}
It doesn’t matter whether the code inside the loop executes once, twice, or any constant number of times. Both these loops process the input in a number of operations proportional to n, and thus can be described as linear.
Now here’s an example of code with time complexity of O(n²):
func pizzaDelivery(pizzas int) {
for pizza := 0; pizza <= pizzas; pizza++ {
// slice pizza
for slice := 0; slice <= pizza; slice++ {
// eat slice of pizza
}
}
}
Because there are two nested loops, or nested linear operations, the algorithm processes the input n² times.
Extending on the previous example, this code with three nested loops has time complexity of O(n³):
func pizzaDelivery(boxesDelivered int) {
for pizzaBox := 0; pizzaBox <= boxesDelivered; pizzaBox++ {
// open box
for pizza := 0; pizza <= pizzaBox; pizza++ {
// slice pizza
for slice := 0; slice <= pizza; slice++ {
// eat slice of pizza
}
}
}
}
A logarithmic algorithm is one that reduces the size of the input at every step. We denote this time complexity as O(log n), where log, the logarithm function, produces a curve that rises quickly at first and then flattens out as n grows.
One example of this is a binary search algorithm that finds the position of an element within a sorted array. Here’s how it would work, assuming we’re trying to find the element x:

1. Compare x to the middle element of the array.
2. If they match, we’re done.
3. If x is smaller than the middle element, repeat the search on the left half of the array.
4. If x is larger, repeat the search on the right half.

Each comparison discards half of the remaining elements.
I find the clearest analogy for understanding binary search is imagining the process of locating a book in a bookstore aisle. If the books are organized by author’s last name and you want to find “Terry Pratchett,” you know you need to look for the “P” section.
You can approach the shelf at any point along the aisle and look at the author’s last name there. If you’re looking at a book by Neil Gaiman, you know you can ignore all the rest of the books to your left, since no letters that come before “G” in the alphabet happen to be “P.” You would then move down the aisle to the right any amount, and repeat this process until you’ve found the Terry Pratchett section, which should be rather sizable if you’re at any decent bookstore because wow did he write a lot of books.
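The bookstore search above can be sketched as a binary search in Go. This is a minimal illustrative implementation using integers in place of author names:

```go
package main

import "fmt"

// binarySearch returns the index of x in a sorted slice, or -1 if
// x isn't present. Each step halves the remaining search space,
// giving O(log n) comparisons.
func binarySearch(sorted []int, x int) int {
	lo, hi := 0, len(sorted)-1
	for lo <= hi {
		mid := (lo + hi) / 2
		switch {
		case sorted[mid] == x:
			return mid
		case sorted[mid] < x:
			lo = mid + 1 // ignore everything to the left
		default:
			hi = mid - 1 // ignore everything to the right
		}
	}
	return -1
}

func main() {
	shelf := []int{2, 5, 8, 12, 16, 23, 38}
	fmt.Println(binarySearch(shelf, 23)) // prints 5
}
```

Note that this only works because the slice is sorted, just as the bookstore trick only works because the shelves are alphabetized.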
Often seen with sorting algorithms, the time complexity O(n log n) can describe an algorithm that performs a O(log n) operation for each of n elements. One example of this is quick sort, a divide-and-conquer algorithm (in its average case; its worst case is O(n²)).
Quick sort works by dividing up an unsorted array into smaller chunks that are easier to process. It sorts the sub-arrays, and thus the whole array. Think about it like trying to put a deck of cards in order. It’s faster if you split up the cards and get five friends to help you.
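Here's a minimal sketch of the quick sort idea in Go. For clarity it allocates new slices on each partition rather than sorting in place, which is a common simplification of the textbook algorithm:

```go
package main

import "fmt"

// quickSort sorts by picking a pivot, partitioning the remaining
// elements into smaller and larger groups, and recursively sorting
// each group - the divide-and-conquer that gives O(n log n) on average.
func quickSort(nums []int) []int {
	if len(nums) <= 1 {
		return nums // a single card is already "sorted"
	}
	pivot := nums[0]
	var smaller, larger []int
	for _, n := range nums[1:] {
		if n < pivot {
			smaller = append(smaller, n)
		} else {
			larger = append(larger, n)
		}
	}
	// sorted smaller chunk, then the pivot, then sorted larger chunk
	return append(append(quickSort(smaller), pivot), quickSort(larger)...)
}

func main() {
	fmt.Println(quickSort([]int{5, 2, 9, 1, 7})) // prints [1 2 5 7 9]
}
```

The partitioning step is the "split up the cards" move; each recursive call works on a chunk roughly half the size, which is where the log n factor comes from.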
The below classes of algorithms are non-polynomial.
An algorithm with time complexity O(n!) often iterates through all permutations of the input elements. One common example is a brute-force search seen in the travelling salesman problem. It tries to find the least costly path between a number of points by enumerating all possible permutations and finding the ones with the lowest cost.
An exponential algorithm often iterates through all subsets of the input elements. It is denoted O(2ⁿ) and is often seen in brute-force algorithms. It is similar to factorial time except in its rate of growth, which, as you may not be surprised to hear, is exponential. The larger the data set, the steeper the curve becomes.
In cryptography, a brute-force attack may systematically check all possible elements of a password by iterating through subsets. Using an exponential algorithm to do this, it becomes incredibly resource-expensive to brute-force crack a long password versus a shorter one. This is one reason that a long password is considered more secure than a shorter one.
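To make the 2ⁿ growth concrete, here's a sketch of subset enumeration in Go. Treating each n-bit number as a membership mask is one standard way to generate every subset (the string items here just stand in for password elements):

```go
package main

import "fmt"

// allSubsets returns every subset of the input - all 2^n of them -
// by treating each n-bit integer as a membership mask.
func allSubsets(items []string) [][]string {
	n := len(items)
	subsets := make([][]string, 0, 1<<n)
	for mask := 0; mask < 1<<n; mask++ {
		var subset []string
		for i := 0; i < n; i++ {
			if mask&(1<<i) != 0 { // bit i set: include item i
				subset = append(subset, items[i])
			}
		}
		subsets = append(subsets, subset)
	}
	return subsets
}

func main() {
	fmt.Println(len(allSubsets([]string{"a", "b", "c"}))) // prints 8
}
```

Adding one more item doubles the count: 3 items give 8 subsets, 4 give 16, and 40 would give over a trillion, which is why brute-forcing long passwords is so expensive.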
There are further time complexity classes less commonly seen that I won’t cover here, but you can read about these and find examples in this handy table.
As I described in my article explaining recursion using apple pie, a recursive function calls itself under specified conditions. Its time complexity depends on how many times the function is called and the time complexity of a single function call. In other words, it’s the product of the number of times the function runs and a single execution’s time complexity.
Here’s a recursive function that eats pies until no pies are left:
func eatPies(pies int) int {
if pies == 0 {
return pies
}
return eatPies(pies - 1)
}
The time complexity of a single execution is constant. No matter how many pies are input, the program will do the same thing: check to see if the input is 0. If so, return, and if not, call itself with one fewer pie.
The initial number of pies could be any number, and we need to process all of them, so we can describe the input as n. Thus, the time complexity of this recursive function is the product of n calls and O(1) work per call: O(n).
So far, we’ve talked about the time complexity of a few nested loops and some code examples. Most algorithms, however, are built from many combinations of these. How do we determine the time complexity of an algorithm containing many of these elements strung together?
Easy. We can describe the total time complexity of the algorithm by finding the largest complexity among all of its parts. This is because the slowest part of the code is the bottleneck, and time complexity is concerned with describing the worst case for the algorithm’s run time.
Say we have a program for an office party. If our program looks like this:
package main
import "fmt"
func takeCupcake(cupcakes []int) int {
fmt.Println("Have cupcake number",cupcakes[0])
return cupcakes[0]
}
func eatChips(bowlOfChips int) {
fmt.Println("Have some chips!")
for chip := 0; chip <= bowlOfChips; chip++ {
// dip chip
}
fmt.Println("No more chips.")
}
func pizzaDelivery(boxesDelivered int) {
fmt.Println("Pizza is here!")
for pizzaBox := 0; pizzaBox <= boxesDelivered; pizzaBox++ {
// open box
for pizza := 0; pizza <= pizzaBox; pizza++ {
// slice pizza
for slice := 0; slice <= pizza; slice++ {
// eat slice of pizza
}
}
}
fmt.Println("Pizza is gone.")
}
func eatPies(pies int) int {
if pies == 0 {
fmt.Println("Someone ate all the pies!")
return pies
}
fmt.Println("Eating pie...")
return eatPies(pies - 1)
}
func main() {
takeCupcake([]int{1, 2, 3})
eatChips(23)
pizzaDelivery(3)
eatPies(3)
fmt.Println("Food gone. Back to work!")
}
We can describe the time complexity of all the code by the complexity of its most complex part. This program is made up of functions we’ve already seen, with the following time complexity classes:
| Function | Class | Big O |
|---|---|---|
| `takeCupcake` | constant | O(1) |
| `eatChips` | linear | O(n) |
| `pizzaDelivery` | cubic | O(n³) |
| `eatPies` | linear (recursive) | O(n) |
To describe the time complexity of the entire office party program, we choose the worst case. This program would have the time complexity O(n³).
Here’s the office party soundtrack, just for fun.
Have cupcake number 1
Have some chips!
No more chips.
Pizza is here!
Pizza is gone.
Eating pie...
Eating pie...
Eating pie...
Someone ate all the pies!
Food gone. Back to work!
You may come across these terms in your explorations of time complexity. Informally, P (for Polynomial time), is a class of problems that is quick to solve. NP, for Nondeterministic Polynomial time, is a class of problems where the answer can be quickly verified in polynomial time. NP encompasses P, but also another class of problems called NP-complete, for which no fast solution is known.[5] Outside of NP but still including NP-complete is yet another class called NP-hard, which includes problems that no one has been able to verifiably solve with polynomial algorithms.[6]
P versus NP is an unsolved, open question in computer science.
Anyway, you don’t generally need to know about NP and NP-hard problems to begin taking advantage of understanding time complexity. They’re a whole other Pandora’s box.
So far, we’ve identified some different time complexity classes and how we might determine which one an algorithm falls into. So how does this help us before we’ve written any code to evaluate?
By combining a little knowledge of time complexity with an awareness of the size of our input data, we can take a guess at an efficient algorithm for processing our data within a given time constraint. We can base our estimation on the fact that a modern computer can perform some hundreds of millions of operations in a second.[1] The following table from the Competitive Programmer’s Handbook offers some estimates on required time complexity to process the respective input size in a time limit of one second.
| Input size | Required time complexity for 1s processing time |
|---|---|
| n ≤ 10 | O(n!) |
| n ≤ 20 | O(2ⁿ) |
| n ≤ 500 | O(n³) |
| n ≤ 5000 | O(n²) |
| n ≤ 10⁶ | O(n log n) or O(n) |
| n is large | O(1) or O(log n) |
Keep in mind that time complexity is an approximation, and not a guarantee. We can save a lot of time and effort by immediately ruling out algorithm designs that are unlikely to suit our constraints, but we must also consider that Big O notation doesn’t account for constant factors. Here’s some code to illustrate.
The following two algorithms both have O(n) time complexity.
func makeCoffee(scoops int) {
for scoop := 0; scoop <= scoops; scoop++ {
// add instant coffee
}
}
func makeStrongCoffee(scoops int) {
for scoop := 0; scoop <= 3*scoops; scoop++ {
// add instant coffee
}
}
The first function makes a cup of coffee with the number of scoops we ask for. The second function also makes a cup of coffee, but it triples the number of scoops we ask for. To see an illustrative example, let’s ask both these functions for a cup of coffee with a million scoops.
Here’s the output of the Go test:
Benchmark_makeCoffee-4 1000000000 0.29 ns/op
Benchmark_makeStrongCoffee-4 1000000000 0.86 ns/op
Our first function, `makeCoffee`, completed in an average of 0.29 nanoseconds. Our second function, `makeStrongCoffee`, completed in an average of 0.86 nanoseconds. While those may both seem like pretty small numbers, consider that the stronger coffee took nearly three times longer to make. This should make sense intuitively, since we asked it to triple the scoops. Big O notation alone wouldn’t tell you this, since the constant factor of the tripled scoops isn’t accounted for.
Becoming familiar with time complexity gives us the opportunity to write code, or refactor code, to be more efficient. To illustrate, I’ll give a concrete example of one way we can refactor a bit of code to improve its time complexity.
Let’s say a bunch of people at the office want some pie. Some people want pie more than others. The amount that everyone wants some pie is represented by an `int` > 0:
diners := []int{2, 88, 87, 16, 42, 10, 34, 1, 43, 56}
Unfortunately, we’re bootstrapped and there are only three forks to go around. Since we’re a cooperative bunch, the three people who want pie the most will receive the forks to eat it with. Even though they’ve all agreed on this, no one seems to want to sort themselves out and line up in an orderly fashion, so we’ll have to make do with everybody jumbled about.
Without sorting the list of diners, return the three largest integers in the slice.
Here’s a function that solves this problem and has O(n²) time complexity:
func giveForks(diners []int) []int {
// make a slice to store diners who will receive forks
var withForks []int
// loop over three forks
for i := 1; i <= 3; i++ {
// variables to keep track of the highest integer and where it is
var max, maxIndex int
// loop over the diners slice
for n := range diners {
// if this integer is higher than max, update max and maxIndex
if diners[n] > max {
max = diners[n]
maxIndex = n
}
}
// remove the highest integer from the diners slice for the next loop
diners = append(diners[:maxIndex], diners[maxIndex+1:]...)
// keep track of who gets a fork
withForks = append(withForks, max)
}
return withForks
}
This program works, and eventually returns diners `[88 87 56]`. Everyone gets a little impatient while it’s running though, since it takes rather a long time (about 120 nanoseconds) just to hand out three forks, and the pie’s getting cold. How could we improve it?
By thinking about our approach in a slightly different way, we can refactor this program to have O(n) time complexity:
func giveForks(diners []int) []int {
// make a slice to store diners who will receive forks
var withForks []int
// create variables for each fork
var first, second, third int
// loop over the diners
for i := range diners {
// assign the forks
if diners[i] > first {
third = second
second = first
first = diners[i]
} else if diners[i] > second {
third = second
second = diners[i]
} else if diners[i] > third {
third = diners[i]
}
}
// list the final result of who gets a fork
withForks = append(withForks, first, second, third)
return withForks
}
Here’s how the new program works:
Initially, diner `2` (the first in the list) is assigned the `first` fork. The other forks remain unassigned.

Then, diner `88` is assigned the `first` fork instead. Diner `2` gets the `second` one.

Diner `87` isn’t greater than `first`, which is currently `88`, but it is greater than `2`, who has the `second` fork. So, the `second` fork goes to `87`. Diner `2` gets the `third` fork.

Continuing in this violent and rapid fork exchange, diner `16` is then assigned the `third` fork instead of `2`, and so on.
We can add a print statement in the loop to see how the fork assignments play out:
0 0 0
2 0 0
88 2 0
88 87 2
88 87 16
88 87 42
88 87 42
88 87 42
88 87 42
88 87 43
[88 87 56]
This program is much faster, and the whole epic struggle for fork domination is over in 47 nanoseconds.
As you can see, with a little change in perspective and some refactoring, we’ve made this simple bit of code faster and more efficient.
Well, it looks like our fifteen minute coffee break is up! I hope I’ve given you a comprehensive introduction to calculating time complexity. Time to get back to work, hopefully applying your new knowledge to write more effective code! Or maybe just sound smart at your next office party. :)
“If I have seen further it is by standing on the shoulders of Giants.” –Isaac Newton, 1675
The example code is JavaScript, since that’s what I’ve been working in lately, but I believe the concepts to be pretty universal.
| This… | …is the same as this |
|---|---|
| `i++;` | `i = i + 1;` |
| `i--;` | `i = i - 1;` |
| `apples += 5` | `apples = apples + 5;` |
| `apples -= 5` | `apples = apples - 5;` |
| `apples *= 5` | `apples = apples * 5;` |
| `apples /= 5` | `apples = apples / 5;` |
| This… | …gives this |
|---|---|
| `3 == '3'` | `true` (type converted) |
| `3 === '3'` | `false` (type matters; integer is not a string) |
| `3 != '3'` | `false` (type converted, 3 equals 3) |
| `3 !== '3'` | `true` (type matters; integer is not a string) |
| `\|\|` | logical “or”: either side evaluated |
| `&&` | logical “and”: both sides evaluated |
Given a breakfast object that looks like this:
var breakfast = {
'eggs': 2,
'waffles': 2,
'fruit': {
'blueberries': 5,
'strawberries': 1,
},
'coffee': 1
}
We can iterate through each breakfast item using a for loop as follows:
for (item in breakfast) {
console.log('item: ', item);
}
This produces:
item: eggs
item: waffles
item: fruit
item: coffee
We can access the value of the property or nested properties (in this example, the number of items) like this:
console.log('How many waffles? ', breakfast['waffles'])
console.log('How many strawberries? ', breakfast['fruit']['strawberries'])
Or equivalent syntax:
console.log('How many waffles? ', breakfast.waffles)
console.log('How many strawberries? ', breakfast.fruit.strawberries)
This produces:
How many waffles? 2
How many strawberries? 1
If instead I want to access the property via the value, for example, to find out which items are served in twos, I can do so by iterating like this:
for (item in breakfast) {
if (breakfast[item] == 2) {
console.log('Two of: ', item);
}
}
Which gives us:
Two of: eggs
Two of: waffles
Say I want to increase the number of fruits in breakfast, because sugar is bad for me and I like things that are bad for me. I can do that like this:
var fruits = breakfast['fruit'];
for (f in fruits) {
fruits[f] += 1;
}
console.log(fruits);
Which gives us:
{ blueberries: 6, strawberries: 2 }
Given an array of waffles that looks like this:
var wafflesIAte = [ 1, 3, 2, 0, 5, 2, 11 ];
We can iterate through each item in the array using a for loop:
for (var i = 0; i < wafflesIAte.length; i++) {
console.log('array index: ', i);
console.log('item from array: ', wafflesIAte[i]);
}
This produces:
array index: 0
item from array: 1
array index: 1
item from array: 3
array index: 2
item from array: 2
array index: 3
item from array: 0
array index: 4
item from array: 5
array index: 5
item from array: 2
array index: 6
item from array: 11
Some things to remember:

- `i` in the above context is a placeholder; we could substitute anything we like (`x`, `n`, `underpants`, etc). It simply denotes each instance of the iteration.
- `i < wafflesIAte.length` tells our for loop to continue as long as `i` is less than the array’s length (in this case, 7).
- `i++` is equivalent to `i = i + 1` and means we’re incrementing through our array by one each time. We could also use `i += 2` to proceed with every other item in the array, for example.
- We can specify an item in the array using the array index, written as `wafflesIAte[i]` where `i` is any index of the array. This gives the item at that location.
- Array index always starts with `0`, which is accessed with `wafflesIAte[0]`. Using `wafflesIAte[1]` gives us the second item in the array, which is “3”.
- Remember that `wafflesIAte.length` and the index of the last item in the array are different. The former is 7, the latter is 6.
- When incrementing `i`, remember that `[i+1]` and `[i]+1` are different:
console.log('[i+1] gives next array index: ', wafflesIAte[0+1]);
console.log('[i]+1 gives index value + 1: ', wafflesIAte[0]+1);
Produces:
[i+1] gives next array index: 3
[i]+1 gives index value + 1: 2
The more often you code and correct your errors, the better you’ll remember it next time!
That’s all for now. If you have a correction, best practice, or another common error for me to add, please let me know!