CURL



Everything curl is an extensive guide to everything there is to know about curl, the project, the command-line tool, the library, how everything started and how it came to be what it is today. How we work on developing it further, what it takes to use it, how you can contribute with code and bug reports and why all those millions of existing users use it.

This book is meant to be interesting and useful to both casual readers and the somewhat more experienced developers, and offers something for you all to pick and choose from. Don't read it from front to back. Read the chapters you are curious about and go back and forth as you see fit.

I hope to run this book project as I do all other projects I work on: in the open, completely free to download and read, free for anyone to comment on, available for everyone to contribute to and help out with. Send your bug reports, pull requests or critiques to me and I will improve this book accordingly.

This book will never be finished. I intend to keep working on it and while I may at some point in time consider it fairly complete and covering most aspects of the project (even if only that seems like an insurmountable goal), the curl project will continue to move so there will always be things to update in the book as well.

This book project started at the end of September 2015.

The author 
With the hope of becoming just a co-author of this material, I am Daniel Stenberg. I founded the curl project. I'm a developer at heart, for fun and profit. I live and work in Stockholm, Sweden.


Introduction

All there is to know about me can be found on my web site.

Help!

If you find mistakes, omissions, errors or blatant lies in this document, please send me a refreshed version of the affected paragraph and I will make amended versions. I will give proper credits to everyone who helps out! I hope to make this document better over time.

Preferably, submit errors or pull requests on the book's GitHub page.

License

This document is licensed under the Creative Commons Attribution 4.0 license.

How to read this book

Here is an overview of the main sections of this book and what they cover.

1. The cURL project
2. Open Source
3. The source code
4. Network and protocols
5. Command line basics
6. Using curl
7. How to HTTP with curl
8. Building and installing
9. libcurl basics
10. HTTP with libcurl
11. Bindings
12. libcurl internals
13. Index

How it started

Back in 1996, Daniel Stenberg was writing an IRC bot in his spare time, an automated program that would offer services for the participants in a chatroom dedicated to the Amiga computer (#amiga on the IRC network EFnet). He came to think that it would be fun to get some updated currency rates and have his bot offer a service online for the chat room users to get current exchange rates, to ask the bot "please exchange 200 USD into SEK" or similar.

In order to have the provided exchange rates as accurate as possible, the bot would download the rates daily from a web site that was hosting them. A small tool to download data over HTTP was needed for this task. A quick look-around at the time had Daniel find a tiny tool named httpget (written by a Brazilian named Rafael Sagula). It almost did the job; it just needed a few small tweaks here and there, and soon Daniel had taken over maintenance of the few hundred lines of code it was.

HttpGet 1.0 was subsequently released on April 8th 1997 with brand new HTTP proxy support.

We soon found and fixed support for getting currencies over GOPHER. Once FTP download support was added, the name of the project was changed and urlget 2.0 was released in August 1997. The HTTP-only days were already past.

The project slowly grew bigger. When upload capabilities were added and the name once again was misleading, a second name change was made and on March 20, 1998 curl 4 was released. (The version numbering from the previous names was kept.)

We consider March 20, 1998 to be curl's birthday.

The name

Naming things is hard.

The tool was about uploading and downloading data specified with a URL. It would show the data (by default). The user would "see" the URL perhaps, and "see" spelled with the single letter 'c'. It was also a client-side program, a URL client. So 'c' for Client and URL: cURL.

Nothing more was needed so the name was selected and we never looked back again.

Later on, someone suggested that curl could actually be a clever "recursive acronym" (where the first letter in the acronym refers back to the same word): "Curl URL Request Library"

While that is awesome, it was actually not the original thought. We sort of wish we were that clever though…

There are and were other projects using the name curl in various ways, but we were not aware of them by the time our curl came to be.

Confusions and mixups

Soon after curl was first created, another "curl" appeared that makes a programming language. That curl still exists.

Several libcurl bindings for various programming languages use the term "curl" or "CURL" in part or completely to describe their bindings, so sometimes you will find users talking about curl but targeting neither the command-line tool nor the library that is made by this project.

As a verb

'to curl something' is sometimes used as a reference to using a non-browser tool to download a file or resource from a URL.

What does curl do?

cURL is a project and its primary purpose and focus is to make two products:

1. curl, the command-line tool

2. libcurl, the transfer library with a C API

Both the tool and the library do Internet transfers for resources specified as URLs using Internet protocols.

Everything and anything that is related to Internet protocol transfers can be considered curl's business. Things that are not related to that should be avoided and be left for other projects and products.

It could be important to also consider that curl and libcurl try to avoid handling the actual data that is transferred. It has, for example, no knowledge about HTML or anything else of the content that is popular to transfer over HTTP, but it knows all about how to transfer such data over HTTP.

Both products are frequently used not only to drive thousands or millions of scripts and applications for an Internet connected world, but they are also widely used for server testing, protocol fiddling and trying out new things.

The library is used in every imaginable sort of embedded device where Internet transfers are needed: car infotainment, televisions, Blu-Ray players, set-top boxes, printers, routers, game systems, etc.

Command line tool

Running curl from the command line was natural and Daniel never considered anything else than that it would output data on stdout, to the terminal, by default. The "everything is a pipe" mantra of standard Unix philosophy was something Daniel believed in. curl is like 'cat' or one of the other Unix tools; it sends data to stdout to make it easy to chain together with other tools to do what you want. That's also why virtually all curl options that allow reading from a file or writing to a file also have the ability to do it to stdout or from stdin.

Following the style of how Unix command-line tools work, there was also never any question that it should support multiple URLs on the command line.

The command-line tool is designed to work perfectly from scripts or other automatic means. It doesn't feature any GUI or UI other than mere text in and text out.
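The pipe-friendly behavior described above can be tried out without even touching the network, since curl also speaks the file:// protocol in a default build. A small sketch (the file names are made up for the example):

```shell
# curl sends the fetched data to stdout by default, so it chains with other
# tools just like cat does. file:// is used here so no network is needed.
printf 'hello\nworld\n' > /tmp/curl-demo.txt

# Fetch the local file over file:// and pipe the output onwards.
curl -s file:///tmp/curl-demo.txt | wc -l

# -o picks an output file instead of stdout; '-o -' selects stdout explicitly.
curl -s -o /tmp/curl-copy.txt file:///tmp/curl-demo.txt

# Several URLs on one command line are transferred one after the other,
# all to stdout here, ready for the next tool in the pipe.
curl -s file:///tmp/curl-demo.txt file:///tmp/curl-copy.txt
```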


The library

While the command-line tool came first, the network engine was ripped out and converted into a library during the year 2000 and the concepts we still have today were introduced with libcurl 7.1 in August 2000. Since then, the command line tool has been a thin layer of logic to make a tool around the library that does all the heavy lifting.

libcurl is designed and meant to be available for anyone who wants to add client-side file transfer capabilities to their software, on any platform, any architecture and for any purpose. libcurl is also extremely liberally licensed to avoid that becoming an obstacle.

libcurl is written in traditional and conservative C. Where other languages are preferred, people have created libcurl bindings for them.


Project communication

cURL is an Open Source project consisting of voluntary members from all over the world, living and working in a large number of the world's time zones. To make such a setup actually work, communication and openness are key. We keep all communication public and we use open communication channels. Most discussions are held on mailing lists, and we use bug trackers where all issues are discussed and handled with full insight for everyone who cares to look.

It is important to realize that we are all jointly taking care of the project, we fix problems and we add features. Sometimes a regular contributor grows bored and fades away, sometimes a new eager contributor steps out from the shadows and starts helping out more. To keep this ship going forward as well as possible, it is important that we maintain open discussions and that's one of the reasons why we frown upon users who take discussions privately or try to e-mail individual team members about development issues, questions, debugging or whatever.

In this day and age, mailing lists may be considered sort of the old style of communication— no fancy web forums or similar. Using a mailing list is therefore becoming an art that isn't practised everywhere and may be a bit strange and unusual to you. But fear not. It is just about sending emails to an address that then sends that e-mail out to all the subscribers. Our mailing lists have at most a few thousand subscribers. If you are mailing for the first time, it might be good to read a few old mails first to get to learn the culture and what's considered good practice.

The mailing lists and the bug tracker have changed hosting providers a few times and there are reasons to suspect it might happen again in the future. It is just the kind of thing that happens to a project that lives for a long time.

A few users also hang out on IRC in the #curl channel on freenode.

Mailing list etiquette

Like many communities and subcultures, we have developed guidelines and rules of what we think is the right way to behave and how to communicate on the mailing lists. The curl mailing list etiquette follows the style of traditional Open Source projects.

Do not mail a single individual

Many people send one question directly to one person. That one person gets many mails, and there is only one person who can give you a reply. The question may be something that other people also want to ask, but those other people have no way to read the reply and have to ask the one person the question themselves. The one person consequently gets overloaded with mail.

If you really want to contact an individual and perhaps pay for his or her services, by all means go ahead, but if it's just another curl question, take it to a suitable list instead.

Reply or new mail

Please do not reply to an existing message as a shortcut to post a message to the lists.

Many mail programs and web archivers use information within mails to keep them together as "threads", as collections of posts that discuss a certain subject. If you don't intend to reply on the same or similar subject, don't just hit reply on an existing mail and change subject; create a new mail.
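The "information within mails" referred to above is a set of standard message headers (defined in RFC 5322): a reply carries the Message-ID of the mail it answers, which is why merely editing the subject line does not start a new thread. The values below are illustrative:

```
Message-ID: <my-new-post@example.com>
In-Reply-To: <original-post@example.com>
References: <original-post@example.com>
```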

Reply to the list

When replying to a message from the list, make sure that you do "group reply" or "reply to all", and not just reply to the author of the single mail you reply to.

We are actively discouraging replying back to the single person by setting the Reply-To: field in outgoing mails back to the mailing list address, making it harder for people to mail the author only by mistake.

Use a sensible subject

Please use a mail subject that makes sense and that is related to the contents of your mail. It makes it a lot easier to find your mail afterwards and it makes it easier to track mail threads and topics.

Do not top-post

If you reply to a message, don't use top-posting. Top-posting is when you write the new text at the top of a mail and you insert the previous quoted mail conversation below. It forces users to read the mail in a backwards order to properly understand it.

This is why top posting is so bad:

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

Apart from the screwed-up read order (especially when mixed together in a thread when someone responds using the mandated bottom-posting style), it also makes it impossible to quote only parts of the original mail.

When you reply to a mail you let the mail client insert the previous mail quoted. Then you put the cursor on the first line of the mail and you move down through the mail, deleting all parts of the quotes that don't add context for your comments. When you want to add a comment you do so, inline, right after the quotes that relate to your comment. Then you continue downwards again.

When most of the quotes have been removed and you have added your own words, you are done!

HTML is not for mails

Please switch off those HTML encoded messages. You can mail all those funny mails to your friends. We speak plain text mails.

Quoting

Quote as little as possible. Just enough to provide the context you cannot leave out. A lengthy description can be found here.

Digest

We allow subscribers to subscribe to the "digest" version of the mailing lists. A digest is a collection of mails lumped together in one single mail.

Should you decide to reply to a mail sent out as a digest, there are two things you MUST consider if you really really cannot subscribe normally instead:

1. Cut off all mails and chatter that is not related to the mail you want to reply to.

2. Change the subject name to something sensible and related to the subject, preferably even the actual subject of the single mail you wanted to reply to.

Please tell us how you solved the problem!

Many people mail questions to the list, and other people spend some of their time and effort providing good answers to these questions.

If you are the one who asks, please consider responding once more in case one of the hints was what solved your problem. The people who write answers feel good to know that they provided a good answer and that you fixed the problem. Far too often, the person who asked the question is never heard from again, and we never get to know if he/she is gone because the problem was solved or perhaps because the problem was unsolvable!

Getting the solution posted also helps other users that experience the same problem(s). They get to see (possibly in the web archives) that the suggested fixes actually have helped at least one person.

Mailing lists

Some of the most important mailing lists are…

curl-users

The main mailing list for users and developers of the curl command-line tool, for questions and help around curl concepts, command-line options, the protocols curl can speak or even related tools. We tend to move development issues or more advanced bug fixes discussions over to curl-library instead, since libcurl is the engine that drives most of curl.

See https://cool.haxx.se/mailman/listinfo/curl-users

curl-library

The main development list, and also for users of libcurl. We discuss how to use libcurl in applications as well as development of libcurl itself. You will find lots of questions on libcurl behavior, debugging and documentation issues.

See https://cool.haxx.se/mailman/listinfo/curl-library

curl-announce

This mailing list only gets announcements about new releases and security problems, nothing else. This one is for those who want a more casual feed of information from the project. See https://cool.haxx.se/mailman/listinfo/curl-announce

Reporting bugs

The development team does a lot of testing. We have a whole test suite that is run frequently every day on numerous platforms in order to exercise all code and make sure everything works as intended.

Still, there are times when things aren't working the way they should. Then we appreciate getting those problems reported.

A bug is a problem

Any problem can be considered a bug. Weirdly phrased wording in the manual that prevents you from understanding something is a bug. A surprising side effect of combining multiple options can be a bug, or perhaps it should be better documented? Perhaps the option doesn't do at all what you expected it to? That's a problem and we should fix it!

Problems must be known to get fixed

This may sound easy and uncomplicated but is a fundamental truth in our and other projects. Just because it is an old project with thousands of users doesn't mean that the development team knows about the problem you just stumbled over. Maybe other users haven't paid as much attention to details as you, or perhaps it just never triggered for anyone else.

We rely on users experiencing problems to report them. We need to learn the problems exist so that we can fix them.

Fixing the problems

Software engineering is, to a very large degree, about fixing problems. To fix a problem a developer needs to understand how to repeat it, and to do that the debugging person needs to be told what set of circumstances made the problem trigger.

A good bug report

A good report explains what happened and what you thought was going to happen. Tell us exactly what versions of the different components you used and take us step by step through what you do to get the problem.

After you submit a bug report, you can expect there to be follow-up questions or perhaps requests that you try out various things and tasks in order for the developer to be able to narrow down the suspects and make sure your problem is properly cornered.

A bug report that is submitted but is abandoned by the submitter risks getting closed if the developer fails to understand it, fails to reproduce it or faces other problems when working on it. Don't abandon your report!

Report curl bugs in the curl bug tracker on GitHub!

Testing

Testing software thoroughly and properly is a lot of work. Testing software that runs on dozens of operating systems and dozens of CPU architectures, with server implementations with their own sets of bugs and interpretations of the specs, is even more work.

The curl project has a test suite that iterates over all existing test cases, runs the test and verifies that the outcome is the correct one and that no other problem happened, like a memory leak or something fishy in the protocol layer.

The test suite is meant to be possible to run after you have built curl yourself and there are a fair number of volunteers who also help out by running the test suite automatically a few times per day to make sure the latest commits get a run. This way, we hopefully discover the worst flaws pretty soon after they were introduced.

We don't test everything and even when we try to test things there will always be subtle details that get through and that we, sometimes years after the fact, figure out were wrong.

Due to the nature of different systems and funny use cases on the Internet, eventually some of the best testing is done by users when they run the code to perform their own use cases.

Another limiting factor with the test suite is that the test setup itself is less portable than curl and libcurl so there are in fact platforms where curl runs fine but the test suite cannot execute at all.

Releases

A release in the curl project means packaging up all the source code that is in the master branch of the code repository, signing the package, tagging the point in time in the code repository, and then putting it up on the web site for the world to download.

It is one single source code archive for all platforms curl can run on. It is the one and only package for both curl and libcurl.

We never ship any curl or libcurl binaries from the project. All the packaged binaries that are provided with operating systems or on other download sites are done by gracious volunteers outside of the project.

As of a few years back, we make an effort to do our releases on an eight week cycle and unless some really serious and urgent problem shows up we stick to this schedule. We release on a Wednesday, and then again a Wednesday eight weeks later and so it continues. Non-stop.

For every release we tag the source code in the repository with the release version and we update the changelog.

We had done 160 curl releases by November 2016, and for all the ones made since late 1999 there are lots of release stats available in our curl release log.

Daily snapshots

Every single change to the source code is committed and pushed to the source code repository. This repository is hosted on github.com and uses git these days (but hasn't always been this way). When building curl off the repository, there are a few things you need to generate and set up that sometimes cause people problems or just friction. To help with that, we provide daily snapshots.

The daily snapshots are generated daily (clever naming, right?) as if a release had been made at that point in time. It produces a package of all source code and all files that are normally part of a release, puts it in a package and uploads it to a special place (https://curl.haxx.se/snapshots/) to allow interested people to get the very latest code to test, to experiment or whatever.

The snapshots are only kept for around 20 days before they are deleted.

Security

Security is a primary concern for us in the curl project. We take it seriously and we work hard on providing secure and safe implementations of all protocols and related code. As soon as we get knowledge about a security related problem or just a suspected problem, we deal with it and we will attempt to provide a fix and security notice no later than in the next pending release.

We use a responsible disclosure policy, meaning that we prefer to discuss and work on security fixes out of the public eye, and we alert the vendors on the openwall.org list a few days before we announce the problem and fix to the world. This is done in an attempt to shorten the time span during which the bad guys can take advantage of a problem before a fixed version has been deployed.

Past security problems

Over the years we have had our fair share of security related problems. We work hard on documenting every problem thoroughly with all details listed and clearly stated to aid users. Users of curl should be able to figure out what problems their particular curl versions and use cases are vulnerable to.

To help with this, we present this waterfall chart showing how all vulnerabilities affect which curl versions and we have this complete list of all known security problems since the birth of this project.

Trust

For software to conquer the world, it needs to be trusted. It takes trust to build more trust and it can all be broken down really fast if the foundation is proven to have cracks.

In the curl project we build trust for our users in a few different ways:

1. We are completely transparent about everything. Every decision, every discussion and every line of code is always public and done in the open.

2. We try hard to write reliable code. We write test cases, we review code, we document best practices and we have a style guide that helps us keep code consistent.

3. We stick to promises and guarantees as much as possible. We don't break APIs and we don't abandon support for old systems.

4. Security is of utmost importance and we take every reported incident very seriously and realize that we must fix all known problems and we need to do it responsibly. We do our best to not endanger our users.

5. We act like adults. We can be silly and we can joke around, but we do it responsibly and we follow our Code of Conduct. You should even be able to trust us to behave.

The development team

Daniel Stenberg is the founder and self-proclaimed leader of the project. Everybody else that participates or contributes in the project has thus arrived at a later point in time. Some contributors worked for a while and then left again. Most contributors hang around only for a short while to get their bug fixed or feature merged or similar. Counting all contributors we know the names of, we have received help from more than 1400 individuals.

The list of people that have repeatedly shown up in discussions and commits during the last several years include these stellar individuals:

Daniel Stenberg, Steve Holme, Jay Satiro, Dan Fandrich, Marc Hörsken, Kamil Dudka, Alessandro Ghedini, Yang Tse, Günter Knauf, Tatsuhiro Tsujikawa, Patrick Monnerat and Nick Zitzmann.

Users of curl

We used to say that there are a billion users of curl. It makes a good line to say but in reality we, of course, don't have any numbers that exact. We just estimate and guess based on observations and trends. It also depends on exactly what you would consider "a user" to be. Let's elaborate.

Open Source

The project being Open Source and very liberally licensed means that just about anyone can redistribute curl in source format or built into binary form.

Counting downloads

The curl command-line tool and the libcurl library are available for download for most operating systems via the curl web site, they are provided via third-party installers to a bunch more, and they come installed by default with yet more operating systems. This makes counting downloads from the curl web site completely inappropriate as a means of measurement.

Finding users

So, we can't count downloads and anyone may redistribute it and nobody is forced to tell us they use curl. How can we figure out the numbers? How can we figure out the users? The answer is that we really can't with any decent level of accuracy.

Instead we rely on witness reports, circumstantial evidence, findings on the Internet, the occasional "about box" or license agreement mentioning curl, or authors who ask for help and tell us about their use.

The curl license says users need to repeat it somewhere, like in the documentation, but that's not easy for us to find in many cases, and it's also not easy for us to do anything about it should they decide not to follow the very small license requirement.

Command-line tool users

The command-line tool curl is widely used by programmers around the world in shell and batch scripts, to debug servers and to test out things. There's no doubt it is used by millions every day.

Embedded library

libcurl is what makes our project reach the really large volume of users. The ability to quickly and easily get client side file transfer abilities into your application is desirable for a lot of users, and then libcurl's great portability also helps: you can write more or less the same application on a wide variety of platforms and you can still keep using libcurl for transfers.

libcurl being written in C with no, or just a few, required dependencies also helps to get it used in embedded systems.

libcurl is popularly used in smartphone operating systems, in car infotainment setups, in television sets, in set-top boxes, in audio and video equipment such as Blu-Ray players and higher-end receivers. It is often used in home routers and printers.

A fair number of best-selling games are also using libcurl, on Windows and game consoles.

In web site backends

The libcurl binding for PHP was one of, if not the, first bindings for libcurl to really catch on and get used widely. It quickly got adopted as a default way for PHP users to transfer data, and it has now been in that position for over a decade. PHP has turned out to be a fairly popular technology on the Internet (recent numbers indicated that something like a quarter of all sites on the Internet use PHP).

A few really high-demand sites are using PHP and are using libcurl in the backend. Facebook and Yahoo are two such sites.

Famous users

Nothing forces users to tell us they use curl or libcurl in their services or products. We usually only find out they do by accident, by reading "about" dialogues, documentation and license agreements. Of course some companies also just flat out tell us.

We collect names of companies and products on our web site of users that use the project's products "in commercial environments". We do this mostly just to show off to other big brands: if these other guys can build products that depend on us, maybe you can, too?

The list of companies contains well over 200 names, but extracting some of the larger or more well-known brands, here's a pretty good list that, of course, is only a small selection:

Adobe, Altera, AOL, Apple, AT&T, BBC, Blackberry, BMW, Bosch, Broadcom, Chevrolet, Cisco, Comcast, Facebook, Google, Hitachi, Honeywell, HP, Huawei, HTC, IBM, Intel, LG, Mazda, Mercedes-Benz, Motorola, Netflix, Nintendo, Oracle, Panasonic, Philips, Pioneer, RBS, Samsung, SanDisk, SAP, SAS Institute, SEB, Sharp, Siemens, Sony, Spotify, Sun, Swisscom, Tomtom, Toshiba, VMware, Xilinx, Yahoo, Yamaha

Future

There's no slowdown in sight in curl's future, bugs reported, development pace or how Internet protocols are being developed or updated.

We are looking forward to support for more protocols, support for more features within the already supported protocols, and more and better APIs for libcurl to allow users to do transfers even better and faster.

The project casually maintains a TODO file holding a bunch of ideas that we could work on in the future. It also keeps a KNOWN_BUGS document with a list of known problems we would like to fix.

There's a ROADMAP document that describes some short-term plans that some of the active developers thought they would work on next. Of course, we cannot promise that we will always follow it perfectly.

We are highly dependent on developers to join in and work on what they want to get done, be it bug fixes or new features.

Open Source

What is Open Source

Generally, Open Source software is software that can be freely accessed, used, changed, and shared (in modified or unmodified form) by anyone. Open Source software is typically made by many people, and distributed under licenses that comply with the definition.

Free Software is an older and related term that basically says the same thing for all our intents and purposes, but we stick to the term Open Source in this document for simplicity.

License

curl and libcurl are distributed under an Open Source license known as an MIT license derivative. It is very short, simple and easy to grasp. It follows here in full:

COPYRIGHT AND PERMISSION NOTICE

Copyright (c) 1996 - 2017, Daniel Stenberg, <daniel@haxx.se>.

All rights reserved.

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization of the copyright holder.

This is basically legalese that says you are allowed to change the code, redistribute the code, redistribute binaries built from the code and build proprietary code with it, without anyone requiring you to give any changes back to the project—but you may not claim that you wrote it.

Early on in the project we iterated over a few other licenses before we settled on this. We started out with GPL, then tried MPL and landed on this MIT derivative. We do not intend to change the license again.

Copyright[modifier]

Copyright is a legal right granted by the law of a country that gives the creator of an original work exclusive rights for its use and distribution.

The copyright owner(s) can agree to allow others to use their work by licensing it. That's what we do in the curl project. The copyright is the foundation on which the licensing works.

Daniel Stenberg is the owner of most copyrights in the curl project.

Independent[modifier]

A lot of Open Source projects are run within umbrella organizations. Such organizations include the GNU project, the Apache Software Foundation, a larger company that funds the project or similar. The curl project is not part of any such larger organization but is completely independent and free.

No company controls curl's destiny and the curl project does not need to follow any umbrella organization's guidelines.

curl is not a formal company, organization or a legal entity of any kind. curl is just an informal collection of humans, distributed across the globe, who work together on a software project.

Legal[modifier]

The curl project obeys national laws of the countries in which it works. However, it is a highly visible international project, downloadable and usable in effectively every country on earth, so some local laws could be broken when using curl. That's just the nature of it and if uncertain, you should check your own local situation.

There have been lawsuits involving technology that curl provides. One such case known to the author was a patent case in the US in which a company insisted it held the rights to resumed file transfers.

As a generic software component that is usable everywhere by everyone, there are times when libcurl in particular is used in nefarious or downright malicious ways. Examples include being used in viruses and malware. That is unfortunate but nothing we can prevent.

Code of conduct[modifier]

As contributors and maintainers of this project, we pledge to respect all people who contribute through reporting issues, posting feature requests, updating documentation, submitting pull requests or patches, and other activities.

We are committed to making participation in this project a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression, sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.

Examples of unacceptable behavior by participants include the use of sexual language or imagery, derogatory comments or personal attacks, trolling, public or private harassment, insults, or other unprofessional conduct.

Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed from the project team.

This code of conduct applies both within project spaces and in public spaces when an individual is representing the project or its community.

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by opening an issue or contacting one or more of the project maintainers.

Development[modifier]

We encourage everyone to participate in the development of curl and libcurl. We appreciate all the help we can get and while the main portion of this project is source code, there is a lot more than just coding and debugging help that is needed and useful.

We develop and discuss everything in the open, preferably on the mailing lists.

Source code on github[modifier]

The source code to curl and libcurl has always been provided and published publicly, and it continues to be uploaded to the main web site for every release.

Since March 2010, the curl source code repository has been hosted on github.com. By keeping up to date with the changes there, you can follow our day-to-day development very closely.

The source code[modifier]

The source code is, of course, the actual engine parts of this project. After all, it is a software project.

curl and libcurl are written in C.

Hosting and download[modifier]

You can always find the source code for the latest curl and libcurl release on the official curl web site. From there you can also find alternative mirrors that host copies, and there are checksums and digital signatures provided to help you verify that what ends up on your local system when you download these files is the same bytes, in the same order, as were originally uploaded there by the curl team.

If you would rather work directly with the curl source code off our source code repository, you will find all the details in the curl github repository.

Clone the code[modifier]

git clone https://github.com/curl/curl.git

This will get the latest curl code downloaded and unpacked in a directory on your local system.

Code layout[modifier]

The curl source code tree is neither large nor complicated. A key thing to remember is, perhaps, that libcurl is the library and that library is the biggest component of the curl command-line tool.

root[modifier]

We try to keep the number of files in the source tree root to a minimum. You will see a slight difference in files if you check a release archive compared to what is stored in the git repository as several files are generated by the release scripts.

Some of the more notable ones include:

buildconf: used to build configure and more when building curl from source out of the git repository.

buildconf.bat: the Windows version of buildconf. Run this after having checked out the full source code from git.

CHANGES: generated at release time and put into the release archive. It contains the 1000 latest changes to the source repository.

configure: a generated script that is used on Unix-like systems to generate a setup when building curl.

COPYING: the license detailing the rules for your use of the code.

GIT-INFO: only present in git and contains information about how to build curl after having checked out the code from git.

maketgz: the script used to produce release archives and daily snapshots.

README: a short summary of what curl and libcurl are.

RELEASE-NOTES: contains the changes done for the latest release; when found in git it contains the changes done since the previous release that are destined to end up in the coming release.

lib[modifier]

This directory contains the full source code for libcurl. It is the same source code for all platforms—over one hundred C source files and a few more private header files. The header files used when building applications against libcurl are not stored in this directory; see include/curl for those.


Depending on what features are enabled in your own build and what functions your platform provides, some of the source files or portions of the source files may contain code that is not used in your particular build.

lib/vtls[modifier]

The vtls section within libcurl is the home of all the TLS backends libcurl can be built to support. The "virtual" TLS internal API is a common API that is used within libcurl to access TLS and crypto functions without the main code knowing exactly which TLS library is used. This allows the person who builds libcurl to select from a wide variety of TLS libraries to build with.

We also maintain an SSL comparison table on the web site to aid users.

OpenSSL: the (by far) most popular TLS library.

BoringSSL: an OpenSSL fork maintained by Google. It makes libcurl disable a few features because the library lacks some functionality.

LibreSSL: an OpenSSL fork maintained by the OpenBSD team.

NSS: a full-blown TLS library perhaps most known for being used by the Firefox web browser. This is the default TLS backend for curl on Fedora and Redhat systems.

GnuTLS: a full-blown TLS library used by default by the Debian packaged curl.

mbedTLS: (formerly known as PolarSSL) a TLS library more targeted towards the embedded market.

WolfSSL: (formerly known as cyaSSL) a TLS library more targeted towards the embedded market.

axTLS: a minuscule TLS library focused on a small footprint.

SChannel: the native TLS library on Windows.

SecureTransport: the native TLS library on Mac OS X.

GSKit: the native TLS library on OS/400.

src[modifier]

This directory holds the source code for the curl command-line tool. It is the same source code for all platforms that run the tool.

Most of what the command-line tool does is convert the given command-line options into the corresponding libcurl option or set of options, and then make sure to issue them correctly to drive the network transfer according to the user's wishes.

This code uses libcurl just as any other application would.

include/curl[modifier]


Here are the public header files that are provided for libcurl-using applications. Some of them are generated at configure or release time, so they do not look the same in the git repository as they do in a release archive.

With modern libcurl, all an application is expected to include in its C source code is:

#include <curl/curl.h>

docs[modifier]

The main documentation location. Files in this directory are typically plain text. We have slowly started to move towards Markdown format, so a few (but hopefully a growing number of) files use the .md extension to signify that.

Most of these documents are also shown on the curl web site automatically converted from text to a web friendly format/look.

BINDINGS: lists all known libcurl language bindings and where to find them

BUGS: how to report bugs and where

CODE_OF_CONDUCT.md: how we expect people to behave in this project

CONTRIBUTE: what to think about when contributing to the project

curl.1: the curl command-line tool man page, in nroff format

curl-config.1: the curl-config man page, in nroff format

FAQ: frequently asked questions about various curl-related subjects

FEATURES: an incomplete list of curl features

HISTORY: describes how the project started and has evolved over the years

HTTP2.md: how to use HTTP/2 with curl and libcurl

HTTP-COOKIES: how curl supports and works with HTTP cookies

index.html: a basic HTML page as a documentation index page

INSTALL: how to build and install curl and libcurl from source

INSTALL.cmake: how to build curl and libcurl with CMake

INSTALL.devcpp: how to build curl and libcurl with devcpp

INTERNALS: details curl and libcurl internal structures

KNOWN_BUGS: list of known bugs and problems

LICENSE-MIXING: describes how to combine different third party modules and their individual licenses

MAIL-ETIQUETTE: this is how to communicate on our mailing lists

MANUAL: a tutorial-like guide on how to use curl

mk-ca-bundle.1: the mk-ca-bundle tool man page, in nroff format

README.cmake: CMake-specific details

README.netware: Netware-specific details

README.win32: win32-specific details

RELEASE-PROCEDURE: how to do a curl and libcurl release

RESOURCES: further resources for further reading on what, why and how curl does things

ROADMAP.md: what we want to work on in the future

SECURITY: how we work on security vulnerabilities

SSLCERTS: TLS certificate handling documented

SSL-PROBLEMS: common SSL problems and their causes

THANKS: thanks to this extensive list of friendly people, curl exists today!

TheArtOfHttpScripting: a tutorial into HTTP scripting with curl

TODO: things we or you can work on implementing

VERSIONS: how the version numbering of libcurl works

docs/libcurl[modifier]

All libcurl functions have their own man pages in individual files with .3 extensions, using nroff format, in this directory. There are also a few other files that are described below.

 ABI 
 index.html 
 libcurl.3 
 libcurl-easy.3 
 libcurl-errors.3 
 libcurl.m4 
 libcurl-multi.3 
 libcurl-share.3 
 libcurl-thread.3 
 libcurl-tutorial.3 
 symbols-in-versions

docs/libcurl/opts[modifier]

This directory contains the man pages for the individual options for three different libcurl functions.

curl_easy_setopt() options start with CURLOPT_, curl_multi_setopt() options start with CURLMOPT_ and curl_easy_getinfo() options start with CURLINFO_.

docs/examples[modifier]

Contains around 100 stand-alone examples that are meant to help readers understand how libcurl can be used.

See also the libcurl examples section of this book.

Handy scripts[modifier]

contributors.sh: extracts all contributors from the git repository since a given hash/tag. The purpose is to generate a list for the RELEASE-NOTES file and to allow manually added names to remain in there even on updates. The script uses the `THANKS-filter` file to rewrite some names.

contrithanks.sh: extracts contributors from the git repository since a given hash/tag, filters out all the names that are already mentioned in THANKS, and then outputs THANKS to stdout with the list of new contributors appended at the end; it is meant to allow easier updates of the THANKS document. The script uses the `THANKS-filter` file to rewrite some names.

log2changes.pl: generates the CHANGES file for releases, as used by the release script. It simply converts git log output.

zsh.pl: helper script to provide curl command-line completions to users of the zsh shell.

Handling different build options[modifier]

The curl and libcurl source code have been carefully written to build and run on virtually every computer platform in existence. This can only be done through hard work and by adhering to a few guidelines (and, of course, a fair amount of testing).

A golden rule is to always add #ifdefs that check for specific features, and then have the setup scripts (configure or CMake or hard-coded) check for the presence of said features on a user's system before the program is compiled there. Additionally, and as a bonus, this way of writing the code means some features can be explicitly turned off even if they are present in the system and could be used. Examples of that would be when users want to build a version of the library with a smaller footprint or with support for certain protocols disabled, etc.

The project sometimes uses #ifdef protection around entire source files when, for example, a single file is provided for a specific operating system or perhaps for a specific feature that isn't always present. This is to make it possible for all platforms to always build all files—it simplifies the build scripts and makefiles a lot. A file entirely #ifdefed out hardly adds anything to the build time, anyway.

Rather than sprinkling the code with #ifdefs, to the extent where it is possible, we provide functions and macros that make the code look and work the same, independent of present features. Some of those are then empty macros for the builds that lack the features.

Both TLS handling and name resolving are handled with an internal API that hides the specific implementation and choice of 3rd party software library. That way, most of the internals work the same independent of which TLS library or name resolving system libcurl is told to use.

Style and code requirements[modifier]

Source code that has a common style is easier to read than code that uses different styles in different places. It helps make the code feel like one continuous code base. Easy-to-read code is a very important property: it makes code easier to review when new things are added and it helps debugging when developers are trying to figure out why things go wrong. A unified style is more important than individual contributors having their own personal tastes satisfied.

Our C code has a few style rules. Most of them are verified and upheld by the lib/checksrc.pl script, invoked with make checksrc or even run by default by the build system when built after ./configure --enable-debug has been used.

It is normally not a problem for anyone to follow the guidelines as you just need to copy the style already used in the source code, and there are no particularly unusual rules in our set of rules.

We also work hard on writing code that is warning-free on all the major platforms and in general on as many platforms as possible. Code that obviously will cause warnings will not be accepted as-is.

Some of the rules that you won't be allowed to break are:

Indentation[modifier]

We use only spaces for indentation, never TABs. We use two spaces for each new open brace.

Comments[modifier]

Since we write C89 code, // comments aren't allowed. They weren't introduced in the C standard until C99. We use only /* and */ comments:

/* this is a comment */

Long lines[modifier]

Source code in curl may never be wider than 80 columns. There are two reasons for maintaining this even in the modern era of very large and high resolution screens:


1. Narrower columns are easier to read than very wide ones. There's a reason newspapers have used columns for decades or centuries.

2. Narrower columns allow developers to more easily view multiple pieces of code next to each other in different windows. I often have two or three source code windows next to each other on the same screen, as well as multiple terminal and debugging windows.

Open brace on the same line[modifier]

In if/while/do/for expressions, we write the open brace on the same line as the keyword and we then set the closing brace on the same indentation level as the initial keyword. Like this:

if(age < 40) {
  /* clearly a youngster */
}

else on the following line[modifier]

When adding an else clause to a conditional expression using braces, we add it on a new line after the closing brace. Like this:

if(age < 40) {
  /* clearly a youngster */
}
else {
  /* probably intelligent */
}

No space before parentheses[modifier]

When writing expressions using if/while/do/for, there shall be no space between the keyword and the open parenthesis. Like this:

while(1) {
  /* loop forever */
}

Contributing[modifier]

Contributing means helping out.

When you contribute anything to the project—code, documentation, bug fixes, suggestions or just good advice—we assume you do this with permission and you are not breaking any contracts or laws by providing that to us. If you don't have permission, don't contribute it to us.

Contributing to a project like curl could be many different things. While source code is the stuff that is needed to build the products, we are also depending on good documentation, testing (both test code and test infrastructure), web content, user support and more.

Send your changes or suggestions to the team and by working together we can fix problems, improve functionality, clarify documentation, add features or make anything else you help out with land in the proper place. We will make sure improved code and docs get merged into the source tree properly and that other sorts of contributions are suitably received.

Send your contributions on a mailing list, file an issue or submit a pull request.

Suggestions[modifier]

Ideas are easy, implementations are hard. Yes, we do appreciate good ideas and suggestions of what to do and how to do it, but the chances that the ideas actually turn into real features grow substantially if you also volunteer to participate in converting the idea into reality.

We already gather ideas in the TODO document and we are generally aware of the current trends in the popular networking protocols so there is usually no need to remind us about those.

What to add[modifier]

The best approach for adding anything to curl or libcurl is, of course, to first bring the idea and suggestion to the curl project team members, then discuss with them whether the idea is feasible for inclusion, and then work out how an implementation is best done so it can get merged into the source code repository, assuming that is what you want.

The project generally approves of functions that improve the support for the current protocols, especially features that popular clients or browsers have but that curl still lacks.


Of course, you can also add content to the project that isn't code, like documentation, graphics or web site contents, but the general rules apply equally to that.

If you are fixing a problem you have or a problem that others are reporting, we will be thrilled to receive your fix and merge it as soon as possible!

What not to add[modifier]

There aren't any good rules that say which features we will never accept, but let me instead mention a few things you should avoid in order to get less friction and to be successful, faster:

Do not write up a huge patch first and then send it to the list for discussion. Always start out by discussing on the list, and send your initial review requests early to get feedback on your design and approach. It saves you from wasting time going down a route that might need rewriting in the end anyway!

When introducing things in the code, you need to follow the style and architecture that already exists. When you add code to the ordinary transfer code path, it must, for example, work asynchronously in a non-blocking manner. We will not accept new code that introduces blocking behaviors—we already have too many of those that we haven't managed to remove yet.

Quick hacks or dirty solutions that have a high risk of not working on platforms you don't run or on architectures you don't know. We don't care if you are in a hurry or that it works for you. We do not accept high-risk code or code that is hard to read or understand.

Code that breaks the build. Sure, we accept that we sometimes have to add code to certain areas that makes the new functionality perhaps depend on a specific 3rd party library or a specific operating system and similar, but we can never do that at the expense of all other systems. We don't break the build, and we make sure all tests keep running successfully.

git[modifier]

Our preferred source control tool is git.

While git is sometimes not the easiest tool to learn and master, all the basic steps a casual developer and contributor needs to know are very straight-forward and do not take much time or effort to learn.


This book will not help you learn git. All software developers in this day and age should learn git anyway.

The curl git tree can be browsed with a web browser on our github page at https://github.com/curl/curl .

To check out the curl source code from git, you can clone it like this:

git clone https://github.com/curl/curl.git

Pull request[modifier]

A very popular and convenient way to make your own changes and contribute them back to the project is by doing a so-called pull request on github.

First, you create your own version of the source tree, called a fork, on the github web site. That way you get your own version of the curl git tree that you can clone to a local copy.

You edit your own local copy, commit the changes, push them to the git repository on github and then on the github web site you can select to create a pull request based on your changes done to your local repository clone of the original curl repository.

We recommend doing your work meant for a pull request in a dedicated separate branch and not in master, just to make it easier for you to update a pull request, like after review, for example, or if you realize it was a dead end and you decide to just throw it away.

Make a patch for the mailing list[modifier]

Even if you opt not to make a pull request but prefer the old-fashioned and trusted method of sending a patch to the curl-library mailing list, it is still a good idea to work in a local git branch and commit your changes there.

A branch makes it easy to edit and rebase when you need to change things and it makes it easy to keep syncing to the master branch when things are updated upstream.

Once your commits are fine enough to get sent to the mailing list, you just create patches with git format-patch and send them away. Even more fancy users go directly to git send-email and have git send the e-mail itself!

git commit style[modifier]

When you commit a patch to git, you give it a commit message that describes the change you are committing. We have a certain style in the project that we ask you to use:


[area]: [short line describing the main effect]

[separate the above single line from the rest with an empty line]

[full description, no wider than 72 columns, that describes as much as possible why this change is made, and possibly what things it fixes and everything else that is related]

[Bug: link to the source of the report or more related discussion]
[Reported-by: John Doe - credit the reporter]
[whatever-else-by: credit all helpers, finders, doers]

Don't forget to use git commit --author="Jane Doe <jane@example.com>" if you commit someone else's work, and make sure that you have your own user and e-mail set up correctly in git before you commit!

The author and the *-by: lines are, of course, there to make sure we give the proper credit in the project. We do not want to take someone else's work without clearly attributing where it comes from. Giving correct credit is of utmost importance!

Who decides what goes in?[modifier]

First, it might not be obvious to everyone but there is, of course, only a limited set of people that can actually merge commits into the actual official git repository. Let's call them the core team.

Everyone else can fork off their own curl repository to which they can commit and push changes and host them online and build their own curl versions from and so on, but in order to get changes into the official repository they need to be pushed by a trusted person.

The core team is a small set of curl developers who have been around for several years and have shown that they are skilled developers and that they fully comprehend the values and the style of development we use in this project. They are some of the people listed in The development team section.

You can always bring a discussion to the mailing list with your motivation for why you think your changes should get accepted, or perhaps even object to other changes that are getting in, and so forth. You can even suggest yourself or someone else to be given "push rights" and become one of the selected few in that team.

Daniel remains the project leader and while it is very rarely needed, he has the final say in debates that don't seem to sway in either direction or fail to reach some sort of consensus.


Reporting vulnerabilities[modifier]

All known and public curl or libcurl related vulnerabilities are listed on the curl web site security page .

Security vulnerabilities should not be entered in the project's public bug tracker unless the necessary configuration is in place to limit access to the issue to only the reporter and the project's security team.

Vulnerability handling[modifier]

The typical process for handling a new security vulnerability is as follows.

No information should be made public about a vulnerability until it is formally announced at the end of this process. That means, for example, that a bug tracker entry must NOT be created to track the issue since that will make the issue public and it should not be discussed on any of the project's public mailing lists. Also messages associated with any commits should not make any reference to the security nature of the commit if done prior to the public announcement.

The person discovering the issue, the reporter, reports the vulnerability privately to curl-security@haxx.se. That's an e-mail alias that reaches a handful of selected and trusted people.

Messages that do not relate to the reporting or managing of an undisclosed security vulnerability in curl or libcurl are ignored and no further action is required.

A person in the security team sends an e-mail to the original reporter to acknowledge the report.

The security team investigates the report and either rejects it or accepts it.

If the report is rejected, the team writes to the reporter to explain why.

If the report is accepted, the team writes to the reporter to let him/her know it is accepted and that they are working on a fix.

The security team discusses the problem, works out a fix, considers the impact of the problem and suggests a release schedule. This discussion should involve the reporter as much as possible.


The release of the information should be "as soon as possible" and is most often synced with an upcoming release that contains the fix. If the reporter, or anyone else, thinks the next planned release is too far away then a separate earlier release for security reasons should be considered.

Write a security advisory draft about the problem that explains what the problem is, its impact, which versions it affects, any solutions or workarounds and when the fix was released, making sure to credit all contributors properly.

Request a CVE number from distros@openwall when also informing and preparing them for the upcoming public security vulnerability announcement—attach the advisory draft for information. Note that 'distros' won't accept an embargo longer than 19 days.

Update the "security advisory" with the CVE number.

The security team commits the fix in a private branch. The commit message should ideally contain the CVE number. This fix is usually also distributed to the 'distros' mailing list to allow them to use the fix prior to the public announcement.

On the day of the next release, the private branch is merged into the master branch and pushed. Once pushed, the information is accessible to the public and the actual release should follow suit immediately afterwards.

The project team creates a release that includes the fix.

The project team announces the release and the vulnerability to the world in the same manner we always announce releases—it gets sent to the curl-announce, curl-library and curl-users mailing lists.

The security web page on the web site should get the new vulnerability mentioned.

curl-security@haxx.se[modifier]

Who is on this list? There are a couple of criteria you must meet, and then we might ask you to join the list or you can ask to join it. It really isn't very formal. We basically only require that you have a long-term presence in the curl project and have shown an understanding of the project and its way of working. You must have been around for a good while and you should have no plans on vanishing in the near future.

We do not make the list of participants public mostly because it tends to vary somewhat over time and a list somewhere will only risk getting outdated.

Web site[modifier]

Web site source code[modifier]

Most of the curl web site is also available in a public git repository, although separate from the source code repository since it generally isn't interesting to the same people and we can maintain a different list of people that have push rights, etc.

The web site git repository is available on github at this URL: https://github.com/curl/curl-www and you can clone a copy of the web code like this:

git clone https://github.com/curl/curl-www.git

Building the web

The web site is an old custom-made setup that mostly builds static HTML files from a set of source files. The source files are preprocessed with what is basically a souped-up C preprocessor called fcpp and a set of perl scripts. The man pages get converted to HTML with roffit. Make sure fcpp, perl, roffit, make and curl are all in your $PATH.

Once you have cloned the git repository the first time, invoke sh bootstrap.sh once to get a symlink and some initial local files set up, and then you can build the web site locally by invoking make in the source root tree.

Note that this doesn't make you a complete web site mirror, as some scripts and files are only available on the real actual site, but should give you enough to let you view most HTML pages locally.

Network and protocols

Before diving in and talking about how to use curl to get things done, let's take a look at what all this networking is and how it works, using simplifications and some minor shortcuts to give an easy overview.

The basics are in the networking simplified chapter that tries to just draw a simple picture of what networking is from a curl perspective, and the protocols section which explains what exactly a "protocol" is and how that works.

Networking simplified

Networking means communicating between two endpoints on the Internet. The Internet is just a bunch of interconnected machines (computers really), each using its own individual addresses (called IP addresses). The addresses each machine has can be of different types, and machines can even have temporary addresses. These computers are often called hosts.

The computer, tablet or phone you sit in front of is usually called "the client" and the machine out there somewhere that you want to exchange data with is called "the server". The main difference between the client and the server is in the roles they play here. There's nothing that prevents the roles from being reversed in a subsequent operation.

Which machine

When you want to initiate a transfer to one of the machines out there (a server), you usually don't know its IP addresses but instead you usually know its name. The name of the machine you will talk to is embedded in the URL that you work with when you use curl.

You might use a URL like "http://example.com/index.html", which means you will connect to and communicate with the host named example.com.

Host name resolving

Once we know the host name, we need to figure out which IP addresses that host has so that we can contact it.

Converting the name to an IP address is often called 'name resolving'. The name is "resolved" to a set of addresses. This is usually done by a "DNS server", DNS being like a big lookup table that can convert names to addresses—all the names on the Internet, really. Your computer normally already knows the address of a computer that runs the DNS server as that is part of setting up the network.

curl will therefore ask the DNS server: "Hello, please give me all the addresses for example.com", and the server responds with a list of them. Or, in case you spelled the name wrong, it can answer back that the name doesn't exist.
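A quick way to see the failure mode in practice: the .invalid top-level domain is reserved and never resolves, so curl gives up with a resolve error (the host name here is made up):

```shell
# Ask curl for a name that can never resolve; .invalid is reserved for this.
# curl fails with a nonzero exit code (6 means "could not resolve host").
curl --silent http://nonexistent.invalid/ || echo "curl failed with exit code $?"
```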

Establish a connection

With a list of IP addresses for the host curl wants to contact, curl sends out a "connect request". The connection curl wants to establish is called TCP and it works sort of like connecting an invisible string between two computers. Once established, it can be used to send a stream of data in both directions.

As curl gets a list of addresses for the host, it will traverse that list when connecting, and if one address fails it will try to connect to the next one, until either one works or they have all failed.

Connects to "port numbers"

When connecting with TCP to a remote server, a client selects which port number to do that on. A port number is just a dedicated place for a particular service, which allows that same server to listen to other services on other port numbers at the same time.

Most common protocols have default port numbers that clients and servers use. For example, when using the " http://example.com/index.html " URL, that URL specifies a scheme called "http" which tells the client that it should try TCP port number 80 on the server by default. The URL can optionally provide another, custom, port number but if nothing special is specified, it will use the default port for the scheme used in the URL.

TLS

After the TCP connection has been established, many transfers require that both sides negotiate a better security level before continuing, and that is often TLS, Transport Layer Security. If it is used, the client and server do a TLS handshake first and continue further only if that succeeds.

Transfer data

When the connecting "string" we call TCP is attached to the remote computer (and we have done the possible additional TLS handshake), there's an established connection between the two machines and that connection can then be used to exchange data. That communication is done using a "protocol", as discussed in the following chapter.

Protocol

The language used to ask for data to get sent—in either direction—is called the protocol. The protocol describes exactly how to ask the server for data, or to tell the server that there is data coming.

Protocols are typically defined by the IETF (Internet Engineering Task Force), which hosts RFC documents that describe exactly how each protocol works: how clients and servers are supposed to act and what to send and so on.

What protocols does curl support?

curl supports protocols that allow "data transfers" in either or both directions. We usually also restrict ourselves to protocols that have a "URI format" described in an RFC or that are at least somewhat widely used, as curl works primarily with URLs (URIs really) as the input key that specifies the transfer.

The latest curl (as of this writing) supports these protocols:

DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP
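Note that an individual curl binary may support only a subset of these, depending on how it was built. You can check what your own build supports by looking at the "Protocols:" line in its version output:

```shell
# Print the protocols this particular curl build supports
curl --version | grep '^Protocols:'
```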

To complicate matters further, the protocols often exist in different versions or flavors as well.

What other protocols are there?

The world is full of protocols, both old and new. Old protocols get abandoned and dropped and new ones get introduced. There's never a state of stability but the situation changes from day to day and year to year. You can rest assured that there will be new protocols added in the list above in the future and that there will be new versions of the protocols already listed.

There are, of course, already other protocols in existence that curl doesn't yet support. We are open to supporting more protocols that suit the general curl paradigms, we just need developers to write the necessary code adjustments for them.

How are protocols developed?

Both new versions of existing protocols and entirely new protocols are usually developed by persons or teams that feel that the existing ones are not good enough. Something about them makes them not suitable for a particular use case or perhaps some new idea has popped up that could be applied to improve things.

Of course, nothing prevents anyone from developing a protocol entirely on their own at their own pleasure in their own backyard, but the major protocols are usually brought to the IETF at a fairly early stage where they are then discussed, refined, debated and polished and then eventually, hopefully, turned into a published RFC document.

Software developers then read the RFC specifications and deploy their code in the world based on their interpretations of the words in those documents. It sometimes turns out that some of the specifications are subject to vastly different interpretations, or sometimes the engineers are just lazy, ignore sound advice in the specs and deploy something that doesn't adhere to them. Writing software that interoperates with other implementations of the specifications can therefore end up being hard work.

How much do protocols change?

Like software, protocol specifications are frequently updated and new protocol versions are created.

Most protocols allow some level of extensibility which makes new extensions show up over time, extensions that make sense to support.

The interpretation of a protocol sometimes changes even if the spec remains the same.

The protocols mentioned in this chapter are all "Application Protocols", which means they are carried over lower-level protocols, like TCP, UDP and TLS. They are also themselves protocols that change over time, get new features and get attacked, so that new ways of handling security, etc., force curl to adapt and change.

About adhering to standards and who's right

Generally, there are protocol specs that tell us how to send and receive data for specific protocols. The protocol specs we follow are RFCs put together and published by IETF.

Some protocols are not properly documented in a final RFC, like, for example, SFTP for which our implementation is based on an Internet-draft that isn't even the last available one.

Protocols are, however, spoken by two parties and, as in any conversation, there are then two sides to understanding something or interpreting the given instructions in a spec. Also, lots of network software is written without the authors paying very close attention to the spec, so they end up taking some shortcuts, or perhaps they just interpreted the text differently. Sometimes mistakes and bugs make software behave in ways that are not mandated by the spec, and sometimes in ways that are downright forbidden by it.

In the curl project we use the published specs as rules on how to act until we learn anything else. If popular alternative implementations act differently than what we think the spec says, and that alternative behavior is what works widely on the big Internet, then chances are we will change course and instead decide to act like those others. If a server refuses to talk with us when we think we follow the spec but works fine when we bend the rules ever so slightly, then we probably end up bending them exactly that way—if we can still work successfully with other implementations.

Ultimately, it is a personal decision and up for discussion in every case where we think a spec and the real world don't align.

In the worst cases we introduce options to let application developers and curl users have the final say on what curl should do. I say worst because it is often really tough to ask users to make these decisions as it usually involves very tricky details and weirdness going on and it is a lot to ask of users. We should always do our very best to avoid pushing such protocol decisions to users.

The protocols curl supports

curl supports about 22 protocols. We say "about" because it depends on how you count and what you consider to be distinctly different protocols.

DICT

DICT is a dictionary network protocol; it allows clients to ask dictionary servers for the meaning or explanation of words. See RFC 2229. Dict servers and clients use TCP port 2628.

FILE

FILE is not actually a "network" protocol. It is a URL scheme that allows you to tell curl to get a file from the local file system instead of getting it over the network from a remote server. See RFC 1738.

FTP

FTP stands for File Transfer Protocol and is an old (it originates in the early 1970s) way to transfer files back and forth between a client and a server. See RFC 959. It has been extended greatly over the years. FTP servers and clients use TCP port 21 plus one more port, though the second one is usually dynamically established during communication.

See the external page FTP vs HTTP for how it differs from HTTP.

FTPS

FTPS stands for Secure File Transfer Protocol. It follows the tradition of appending an 'S' to the protocol name to signify that the protocol is done like normal FTP but with an added SSL/TLS security layer. See RFC 4217.

This protocol is very problematic to use through firewalls and other network equipment.

GOPHER

Designed for "distributing, searching, and retrieving documents over the Internet", Gopher is somewhat of a grandfather to HTTP, which has now almost completely taken over for the same use cases. See RFC 1436. Gopher servers and clients use TCP port 70.

HTTP

The Hypertext Transfer Protocol, HTTP, is the most widely used protocol for transferring data on the web and over the Internet. See RFC 7230 for HTTP/1.1 and RFC 7540 for HTTP/2, the successor. HTTP servers and clients use TCP port 80.

HTTPS

Secure HTTP is HTTP done over an SSL/TLS connection. See RFC 2818. HTTPS servers and clients use TCP port 443.

IMAP

The Internet Message Access Protocol, IMAP, is a protocol for accessing, controlling and "reading" email. See RFC 3501. IMAP servers and clients use TCP port 143.

IMAPS

Secure IMAP is IMAP done over an SSL/TLS connection. Such connections usually start out as a "normal" IMAP connection that is then upgraded to IMAPS using the STARTTLS command.

LDAP

The Lightweight Directory Access Protocol, LDAP, is a protocol for accessing and maintaining distributed directory information. Basically a database lookup. See RFC 4511. LDAP servers and clients use TCP port 389.

LDAPS

Secure LDAP is LDAP done over an SSL/TLS connection.

POP3

The Post Office Protocol version 3 (POP3) is a protocol for retrieving email from a server. See RFC 1939. POP3 servers and clients use TCP port 110.

POP3S

Secure POP3 is POP3 done over an SSL/TLS connection. Such connections usually start out as a "normal" POP3 connection that is then upgraded to POP3S using the STARTTLS command.

RTMP

The Real-Time Messaging Protocol (RTMP) is a protocol for streaming audio, video and data. RTMP servers and clients use TCP port 1935.

RTSP

The Real Time Streaming Protocol (RTSP) is a network control protocol to control streaming media servers. See RFC 2326. RTSP servers and clients use TCP and UDP port 554.

SCP

The Secure Copy (SCP) protocol is designed to copy files to and from a remote SSH server. SCP servers and clients use TCP port 22.

SFTP

The SSH File Transfer Protocol (SFTP) provides file access, file transfer, and file management over a reliable data stream. SFTP servers and clients use TCP port 22.

SMB

The Server Message Block (SMB) protocol is also known as CIFS. It is an application-layer network protocol mainly used for providing shared access to files, printers, and serial ports and miscellaneous communications between nodes on a network. SMB servers and clients use TCP port 445.

SMTP

The Simple Mail Transfer Protocol (SMTP) is a protocol for email transmission. See RFC 821 (updated by RFC 5321). SMTP servers and clients use TCP port 25.

SMTPS

Secure SMTP is SMTP done over an SSL/TLS connection. Such connections usually start out as a "normal" SMTP connection that is then upgraded to SMTPS using the STARTTLS command.

TELNET

TELNET is an application layer protocol used over networks to provide a bidirectional interactive text-oriented communication facility using a virtual terminal connection. See RFC 854. TELNET servers and clients use TCP port 23.

TFTP

The Trivial File Transfer Protocol (TFTP) is a protocol for doing simple file transfers over UDP to get a file from or put a file onto a remote host. TFTP servers and clients use UDP port 69.

Command line basics

curl started out as a command-line tool and it has been invoked from shell prompts and from within scripts by thousands of users over the years. curl has established itself as one of those trusty tools that is there for you to help you get your work done.

Binaries and different platforms

The command-line tool "curl" is a binary executable file. The curl project does not by itself distribute or provide binaries. Binary files are highly system specific and oftentimes also bound to specific system versions.

To get a curl for your platform and your system, you need to get a curl executable from somewhere. Many people build their own from the source code provided by the curl project, lots of people install it using a package tool for their operating system and yet another portion of users download binary install packages from sources they trust.

No matter how you do it, make sure you are getting your version from a trusted source and that you verify digital signatures or the authenticity of the packages in other ways.

Also, remember that curl is often built to use third-party libraries to perform its transfers, and unless curl is built to link with them statically you must also have those third-party libraries installed; the exact set of libraries varies depending on the particular build you get.
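The first line of curl's version output names the libcurl version and the third-party libraries a particular binary was built with, which is a quick way to inspect a build you got from elsewhere:

```shell
# Show the curl/libcurl versions and the third-party libraries in this build
curl --version | head -n 1
```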

Command lines, quotes and aliases

There are many different command lines, shells and prompts in which curl can be used. They all come with their own sets of limitations, rules and guidelines to follow. The curl tool is designed to work with any of them without causing troubles but there may be times when your specific command line system doesn't match what others use or what is otherwise documented.

One way that command-line systems differ, for example, is in how you can put quotes around arguments, such as to embed spaces or special symbols. In most Unix-like shells you use double quotes (") or single quotes (') depending on whether you want to allow variable expansion within the quoted string, but on Windows there's no support for the single-quote version.

In some environments, like PowerShell on Windows, the authors of the command-line system decided they know better and "help" the user by making curl an alias for another tool, an alias that takes precedence when a command line is executed. To use curl properly in PowerShell, you need to type its full name including the extension: "curl.exe".

Different command-line environments also have different maximum command line lengths and force users to limit how much data can be put into a single line. curl adapts to this by offering a way to provide command-line options through a file—or from stdin—using the -K option.
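As a sketch of the -K/--config mechanism (the file name curl.cfg is made up for this example): options go one per line, written without their leading dashes, and an argument can follow after an equals sign. The curl invocation is shown commented out since it would perform a real network transfer:

```shell
# Write command-line options to a config file instead of the command line
cat > curl.cfg <<'EOF'
# request verbose output
verbose
# follow HTTP redirects
location
url = "http://example.com/"
EOF

# curl -K curl.cfg    # would perform the transfer using the options above
cat curl.cfg          # show what we wrote
```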

Garbage in, garbage out

curl has very little will of its own. It tries to please you and your wishes to a very large extent. It also means that it will try to play with what you give it. If you misspell an option, it might do something unintended. If you pass in a slightly illegal URL, chances are curl will still deal with it and proceed. It means that you can pass in crazy data in some options and you can have curl pass on that crazy data in its transfer operation.

This is a design choice, as it allows you to really tweak how curl does its protocol communications and you can have curl massage your server implementations in the most creative ways.

Command line options

When telling curl to do something, you invoke curl with zero, one or several command-line options to accompany the URL or set of URLs you want the transfer to be about. curl supports over two hundred different options.

Short options

Command-line options pass on information to curl about how you want it to behave. For example, you can ask curl to switch on verbose mode with the -v option:

curl -v http://example.com

-v is here used as a "short option". You write those with the minus symbol and a single letter immediately following it. Many options are just switches that switch something on or change something between two known states. Several such single-letter options can be combined after a single minus; for example, asking for both verbose mode and that curl follows HTTP redirects:

curl -vL http://example.com

The command-line parser in curl always parses the entire line and you can put the options anywhere you like; they can also appear after the URL:

curl http://example.com -Lv

Long options

Single-letter options are convenient since they are quick to write and use, but as there are only a limited number of letters in the alphabet and there are many things to control, not all options are available like that. Long option names are therefore provided for those. Also, as a convenience and to allow scripts to become more readable, most short options have longer name aliases.

Long options are always written with two minuses (or dashes, whichever you prefer to call them) followed by the name, and you can only write one option name per double-minus. Asking for verbose mode using the long option format looks like:

curl --verbose http://example.com

and asking for HTTP redirects as well using the long format looks like:

curl --verbose --location http://example.com

Arguments to options

Not all options are just simple boolean flags that enable or disable features. For some of them you need to pass on data, like perhaps a user name or a path to a file. You do this by writing first the option and then the argument, separated with a space. Like, for example, if you want to send an arbitrary string of data in an HTTP POST to a server:

curl -d arbitrary http://example.com

and it works the same way even if you use the long form of the option:

curl --data arbitrary http://example.com

When you use the short options with arguments, you can, in fact, also write the data without the space separator:

curl -darbitrary http://example.com

Arguments with spaces

At times you want to pass on an argument to an option, and that argument contains one or more spaces. For example, you may want to set the user-agent field curl uses to be exactly I am your father, including those three spaces. Then you need to put quotes around the string when you pass it to curl on the command line. The exact quotes to use vary depending on your shell/command prompt, but generally it will work with double quotes in most places:

curl -A "I am your father" http://example.com

Failing to use quotes, like if you would write the command line like this:

curl -A I am your father http://example.com

… will make curl use only 'I' as a user-agent string, and the following strings, 'am', 'your', etc., will instead all be treated as separate URLs since they don't start with - to indicate that they're options, and curl only ever handles options and URLs.

To make the string itself contain double quotes, which is common when, for example, you want to send a string of JSON to the server, you may need to use single quotes (except on Windows, where single quotes don't work the same way). To send the JSON string { "name": "Darth" }:

curl -d '{ "name": "Darth" }' http://example.com

Or if you want to avoid the single quote thing, you may prefer to send the data to curl via a file, which then doesn't need the extra quoting. Assuming we call the file 'json' that contains the above mentioned data:

curl -d @json http://example.com

Negative options

For options that switch on something, there is also a way to switch it off. You then use the long form of the option with an initial "no-" prefix before the name. As an example, to switch off verbose mode:

curl --no-verbose http://example.com

Options depend on version

curl was first typed on a command line back in the glorious year of 1998. Even then it operated on the specified URL and zero, one or more command-line options given to it.

Since then we have added more options. We add options as we go along and almost every new release of curl has one or a few new options that allow users to modify certain aspects of its operation.

With the curl project's rather speedy release cycle, shipping a new release every eight weeks, it is almost inevitable that you are not always using the very latest released version of curl. Sometimes you may even use a curl version that is a few years old.

All command-line options described in this book were, of course, added to curl at some point in time, and only a very small portion of them were available that fine spring day in 1998 when curl first shipped. You may have reason to check your version of curl and crosscheck with the curl man page for when certain options were added. This is especially important if you want to take a curl command line using a modern curl version back to an older system that might be running an older installation.

The developers of curl are working hard to not change existing behavior though. Command lines written to use curl in 1998, 2003 or 2010 should all be possible to run unmodified even today.

URLs

curl is called curl because a substring in its name is URL (Uniform Resource Locator). It operates on URLs. URL is the name we casually use for the web address strings, like the ones we usually see prefixed with http:// or starting with www.

URL is, strictly speaking, the former name for these. URI (Uniform Resource Identifier) is the more modern and correct name for them. Their syntax is defined in RFC 3986.

Where curl accepts a "URL" as input, it is then really a "URI". Most of the protocols curl understands also have a corresponding URI syntax document that describes how that particular URI format works.

curl assumes that you give it a valid URL and it only does limited checks of the format in order to extract the information it deems necessary to perform its operation. You can, for example, most probably pass in illegal characters in the URL without curl noticing or caring and it will just pass them on.

Scheme

URLs start with the "scheme", which is the official name for the "http://" part. That tells which protocol the URL uses. The scheme must be a known one that this version of curl supports or it will show an error message and stop. Additionally, the scheme must neither start with nor contain any whitespace.
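You can see that check in action by passing a scheme curl does not know; it errors out before any network traffic happens (the scheme "hxxp" is deliberately bogus):

```shell
# An unknown scheme makes curl stop with exit code 1 (unsupported protocol)
curl --silent hxxp://example.com/ || echo "curl failed with exit code $?"
```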

The scheme separator

The scheme identifier is separated from the rest of the URL by the "://" sequence: a colon and two forward slashes. There exist URL formats with only one slash, but curl doesn't support any of them. There are two additional notes to be aware of, about the number of slashes:

curl allows some illegal syntax and tries to correct it internally; it will also understand and accept URLs with one or three slashes, even though they are in fact not properly formed URLs. curl does this because browsers started this practice and it has led to such URLs being used in the wild every now and then.

file:// URLs are written as file://<hostname>/<path> but the only host names that are okay to use are localhost, 127.0.0.1 or a blank (nothing at all):

file://localhost/path/to/file
file://127.0.0.1/path/to/file
file:///path/to/file

Inserting any other host name in there will make recent versions of curl return an error.

Pay special attention to the third example above (file:///path/to/file). That is three slashes before the path. That is again an area with common mistakes, where browsers allow users to use the wrong syntax, so as a special exception, curl on Windows also allows this incorrect format:

file://X:/path/to/file

… where X is a Windows-style drive letter.
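Since FILE needs no network at all, it is easy to try locally. This sketch assumes a writable /tmp and makes up the file name:

```shell
# Create a small file and fetch it back through curl's FILE protocol
printf 'hello from a local file\n' > /tmp/curl-file-demo.txt
curl file:///tmp/curl-file-demo.txt
```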

Without scheme

As a convenience, curl also allows users to leave out the scheme part from URLs. Then it guesses which protocol to use based on the first part of the host name. That guessing is very basic as it just checks if the first part of the host name matches one of a set of protocols, and assumes you meant to use that protocol. This heuristic is based on the fact that servers traditionally used to be named like that. The protocols that are detected this way are FTP, DICT, LDAP, IMAP, SMTP and POP3. Any other host name in a scheme-less URL will make curl default to HTTP.

You can modify the default protocol to something other than HTTP with the --proto-default option.

Name and password

After the scheme, there can be a possible user name and password embedded. The use of this syntax is usually frowned upon these days since you easily leak this information in scripts or otherwise. For example, listing the directory of an FTP server using a given name and password:

curl ftp://user:password@example.com/

The presence of user name and password in the URL is completely optional. curl also allows that information to be provided with normal command-line options, outside of the URL.

Host name or address

The host name part of the URL is, of course, simply a name that can be resolved to a numerical IP address, or the numerical address itself. When specifying a numerical address, use the dotted version for IPv4 addresses:

curl http://127.0.0.1/

…and for IPv6 addresses the numerical version needs to be within square brackets:

curl http://[::1]/

When a host name is used, the converting of the name to an IP address is typically done using the system's resolver functions. That normally lets a sysadmin provide local name lookups in the /etc/hosts file (or equivalent).

Port number

Each protocol has a "default port" that curl will use for it, unless a specified port number is given. The optional port number can be provided within the URL after the host name part, as a colon and the port number written in decimal. For example, asking for an HTTP document on port 8080:

curl http://example.com:8080/

With the name specified as an IPv4 address:

curl http://127.0.0.1:8080/

With the name given as an IPv6 address:

curl http://[fdea::1]:8080/

Path

Every URL contains a path. If there's none given, "/" is implied. The path is sent to the specified server to identify exactly which resource that is requested or that will be provided.

The exact use of the path is protocol dependent. For example, getting a file README from the default anonymous user from an FTP server:

curl ftp://ftp.example.com/README

For the protocols that have a directory concept, ending the URL with a trailing slash means that it is a directory and not a file. Thus asking for a directory list from an FTP server is implied with such a slash:

curl ftp://ftp.example.com/tmp/

FTP type

This is not a feature that is widely used.

URLs that identify files on FTP servers have a special feature that allows you to also tell the client (curl in this case) which file type the resource is. This is because FTP is a little special and can change mode for a transfer and thus handle the file differently than if it would use another mode.

You tell curl that the FTP resource is an ASCII type by appending ";type=A" to the URL. Getting the 'foo' file from example.com's root directory using ASCII could then be made with:

curl "ftp://example.com/foo;type=A"

And while curl defaults to binary transfers for FTP, the URL format allows you to also specify the binary type with type=I:

curl "ftp://example.com/foo;type=I"

Finally, you can tell curl that the identified resource is a directory if the type you pass is D:

curl "ftp://example.com/foo;type=D"

…this can then work as an alternative format, instead of ending the path with a trailing slash as mentioned above.

Fragment

URLs offer a "fragment part". That's usually seen as a hash symbol (#) followed by a name pointing to a specific section within a web page in browsers. curl supports fragments fine when a URL is passed to it, but the fragment part is never actually sent over the wire, so it doesn't make a difference to curl's operations whether it is present or not.
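You can verify that the fragment is kept client-side with a local file: the transfer works exactly as if the "#intro" part were not there (the file name is made up):

```shell
# The fragment part (#intro) is stripped client-side and never used for the transfer
printf 'page content\n' > /tmp/curl-frag-demo.txt
curl --silent "file:///tmp/curl-frag-demo.txt#intro"
```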

Browsers' "address bar"

It is important to realize that when you use a modern web browser, the "address bar" it tends to feature at the top of its main window is not using "URLs" or even "URIs". It is in fact mostly using IRIs, a superset of URIs that allows internationalization like non-Latin symbols and more, but it usually goes beyond that, too, as browsers tend to, for example, handle spaces and do magic things with percent-encoding in ways none of these mentioned specifications say a client should.

The address bar is quite simply an interface for humans to enter and see URI-like strings.

Sometimes the differences between what you see in a browser's address bar and what you can pass in to curl is significant.

Many options and URLs

As mentioned above, curl supports hundreds of command-line options and it also supports an unlimited number of URLs. If your shell or command-line system supports it, there's really no limit to how long a command line you can pass to curl.

curl will parse the entire command line first, apply the wishes from the command-line options used, and then go over the URLs one by one (in a left to right order) to perform the operations.

For some options (for example -o or -O that tell curl where to store the transfer), you may want to specify one option for each URL on the command line.

curl will return an exit code for its operation on the last URL used. If you rather want curl to exit with an error on the first URL in the set that fails, use the --fail-early option.
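
As a sketch of the one-output-per-URL pairing, here is a command line using local file:// URLs so it runs without a network (all file names are made up for the example, and curl is assumed to be installed):

```shell
# two local source files standing in for two remote resources
printf 'first'  > /tmp/src1.txt
printf 'second' > /tmp/src2.txt

# one -o per URL: curl pairs them up in order, left to right
curl -s "file:///tmp/src1.txt" -o /tmp/copy1.txt \
        "file:///tmp/src2.txt" -o /tmp/copy2.txt
```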

Separate options per URL

In previous sections we described how curl always parses all options in the whole command line and applies those to all the URLs that it transfers.

That was a simplification: curl also offers an option (-;, --next) that inserts a sort of boundary between a set of options and the URLs to which they apply. When the command-line parser finds a --next option, it applies the following options to the next set of URLs. The --next option thus works as a divider between a set of options and URLs. You can use as many --next options as you please.

As an example, we do an HTTP GET to a URL and follow redirects, we then make a second HTTP POST to a different URL and we round it up with a HEAD request to a third URL. All in a single command line:

curl --location http://example.com/1 --next --data sendthis http://example.com/2 --next --head http://example.com/3

Trying something like that without the --next options on the command line would generate an illegal command line since curl would attempt to combine both a POST and a HEAD:

Warning: You can only select one HTTP request method! You asked for both POST
Warning: (-d, --data) and HEAD (-I, --head).

Connection reuse

Setting up a TCP connection and especially a TLS connection can be a slow process, even on high bandwidth networks.

It can be useful to remember that curl has a connection pool internally which keeps previously used connections alive and around for a while after they were used so that subsequent requests to the same hosts can reuse an already established connection.

Of course, connections can only be kept alive for as long as the curl tool is running, which is a very good reason to get several transfers done within the same command line instead of running several independent curl invocations.

URL globbing

At times you want to get a range of URLs that are mostly the same, with only a small portion of it changing between the requests. Maybe it is a numeric range or maybe a set of names. curl offers "globbing" as a way to specify many URLs like that easily.

The globbing uses the reserved symbols [] and {}, symbols that normally cannot be part of a legal URL (except in numerical IPv6 addresses, which curl handles fine anyway). If the globbing gets in your way, disable it with -g, --globoff.

While most transfer-related functionality in curl is provided by the libcurl library, the URL globbing feature is not!
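
One way to see -g, --globoff in action without a network is with the file:// protocol and a file whose name genuinely contains brackets (the path is made up for the example, and curl is assumed to be installed):

```shell
# a file whose name really contains brackets
printf 'literal' > '/tmp/glob[1].txt'

# without -g, curl would try to interpret [1] as a glob;
# with -g the brackets are passed through untouched
curl -s -g "file:///tmp/glob[1].txt"
```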

Numerical ranges

You can ask for a numerical range with [N-M] syntax, where N is the start index and it goes up to and including M. For example, you can ask for 100 images one by one that are named numerically:

curl -O http://example.com/[1-100].png

and it can even do ranges with zero-prefixed numbers, for example when the numbers are always three digits:

curl -O http://example.com/[001-100].png

Or maybe you only want even numbered images so you tell curl a step counter too. This example range goes from 0 to 100 with an increment of 2:

curl -O http://example.com/[0-100:2].png

Alphabetical ranges

curl can also do alphabetical ranges, like when a site has sections named a to z:

curl -O http://example.com/section[a-z].html

A list

Sometimes the parts don't follow such an easy pattern, and then you can instead provide the full list yourself, within curly braces instead of the brackets used for ranges:

curl -O http://example.com/{one,two,three,alpha,beta}.html

Combinations

You can use several globs in the same URL which then will make curl iterate over those, too. To download the images of Ben, Alice and Frank, in both the resolutions 100x100 and 1000x1000, a command line could look like:

curl -O http://example.com/{Ben,Alice,Frank}-{100x100,1000x1000}.jpg

Or download all the images of a chess board, indexed by two coordinates ranged 0 to 7:

curl -O http://example.com/chess-[0-7]x[0-7].jpg

And you can, of course, mix ranges and series. Get a week's worth of logs for both the web server and the mail server:

curl -O http://example.com/{web,mail}-log[0-6].txt

Output variables for globbing

In all the globbing examples previously in this chapter we have used the -O / --remote-name option, which makes curl save the target file using the file name part of the URL.

Sometimes that is not enough. You are downloading multiple files and maybe you want to save them in a different subdirectory or create the saved file names differently. curl, of course, has a solution for these situations as well: output file name variables.

Each "glob" used in a URL gets a separate variable. They are referenced as '#[num]': the single character '#' followed by the glob's number, counting from 1 for the left-most glob up to the last glob.

Save the main pages of two different sites:

curl http://{one,two}.example.com -o "file_#1.txt"

Save the outputs from a command line with two globs in a subdirectory:

curl http://{site,host}.host[1-5].example.com -o "subdir/#1_#2"

List options

List all command-line options

curl has more than two hundred command-line options and the number keeps increasing over time. Chances are it will reach 250 within a few years.

In order to find out which options you need to perform a certain action, you can, of course, list all options, scan through the list and pick the one you are looking for. curl --help or simply curl -h will get you a list of all existing options with a brief explanation of each. If you don't really know what you are looking for, you probably won't be entirely satisfied.

Then you can instead opt to use curl --manual, which outputs the entire man page for curl plus an appended tutorial for the most common use cases. That is a thorough and complete document on how each option works, amassing several thousand lines of documentation, so wading through it is tedious work and we encourage use of a search function through those text masses. Some people will appreciate the man page in its web version.

Config file

You can easily end up with curl command lines that use a very large number of command-line options, making them rather hard to work with. Sometimes the length of the command line you want to enter even hits the maximum length your command-line system allows, the Microsoft Windows command prompt being an example of a system with a fairly small maximum line length.

To aid such situations, curl offers a feature we call "config file". It basically allows you to write command-line options in a text file instead and then tell curl to read options from that file in addition to the command line.

You tell curl to read more command-line options from a specific file with the -K/--config option, like this:

curl -K cmdline.txt http://example.com

…and in the cmdline.txt file (which, of course, can use any file name you please) you enter each command-line option on a separate line:

# this is a comment, we ask to follow redirects
--location
# ask to do a HEAD request
--head

The config file accepts both short and long options, exactly as you would write them on a command line. As a special extra feature, it also allows you to write the long format of the options without the leading two dashes to make it easier to read. Using that style, the config file shown above can alternatively be written as:

# this is a comment, we ask to follow redirects
location
# ask to do a HEAD request
head

Command-line options that take an argument must have the argument provided on the same line as the option. For example, changing the User-Agent HTTP header can be done with

user-agent "Everything-is-an-agent"

To allow the config files to look even more like true config files, you may also use '=' or ':' between the option and its argument. As seen above it isn't necessary, but some like the clarity it offers. Setting the user-agent option again:

user-agent = "Everything-is-an-agent"

The argument to an option can be specified without double quotes, in which case curl treats the next space or newline as the end of the argument. So if you want to provide an argument with embedded spaces, you must use double quotes.

The user agent string in the example above has no white space and can therefore also be provided without the quotes:

user-agent = Everything-is-an-agent

Finally, if you want to provide a URL in a config file, you must do that the --url way, or just with url, and not like on the command line, where basically everything that isn't an option is assumed to be a URL. You provide a URL for curl like this:

url = "http://example.com"
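
Putting the pieces together, a complete config file using the options shown above could look like this (the options and URL are just examples):

```
# follow redirects
location
# identify with a custom user agent
user-agent = "Everything-is-an-agent"
# the URL to operate on
url = "http://example.com"
```

Saved as cmdline.txt, it would then be used with curl -K cmdline.txt, exactly as before.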

Default config file

When curl is invoked, it always (unless -q is used) checks for a default config file and uses it if found. The file name it checks for is .curlrc on Unix-like systems and _curlrc on Windows.

The default config file is checked for in the following places in this order:

1. curl tries to find the "home directory": it first checks the CURL_HOME and then the HOME environment variable. Failing that, it uses getpwuid() on Unix-like systems (which returns the home directory of the current user). On Windows, it then checks the APPDATA variable, or as a last resort '%USERPROFILE%\Application Data'.

2. On Windows, if there is no _curlrc file in the home directory, it checks for one in the same directory the curl executable is placed. On Unix-like systems, it will simply try to load .curlrc from the determined home directory.
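
The CURL_HOME lookup can be sketched offline: point CURL_HOME at a directory containing a .curlrc and curl picks that file up automatically. All paths here are illustrative, curl is assumed to be installed, and -q would bypass the whole mechanism:

```shell
# a throwaway "home" directory holding a default config file
mkdir -p /tmp/curl-home
printf 'silent\noutput = /tmp/default-config-out.txt\n' > /tmp/curl-home/.curlrc

# a local file to fetch; the output option from .curlrc decides where it lands
printf 'from-config' > /tmp/curl-src.txt
CURL_HOME=/tmp/curl-home curl "file:///tmp/curl-src.txt"
```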

Passwords

Passwords and snooping

Passwords are tricky and sensitive. Leaking a password can let someone other than you access the resources and the data it protects.

curl offers several ways to receive passwords from the user and then subsequently pass them on or use them for something else.

The most basic curl authentication option is -u / --user. It accepts an argument that is the user name and password, colon separated. Like when alice wants to request a page requiring HTTP authentication and her password is '12345':

$ curl -u alice:12345 http://example.com/

Command line leakage

Several potentially bad things are going on here. First, we are entering a password on the command line, and the command line might be readable by other users on the same system (assuming you have a multi-user system). curl helps minimize that risk by trying to blank out passwords from process listings.

One way to avoid passing the user name and password on the command line is to instead use a .netrc file or a config file. You can also use the -u option without specifying the password, and curl will then prompt the user for it when it runs.
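
As a sketch of the .netrc route, using the same made-up credentials as above, the file ~/.netrc would hold the secrets so they never appear on the command line:

```
machine example.com
login alice
password 12345
```

curl is then invoked with -n / --netrc (or --netrc-file to point at a different file) and looks up the entry matching the host name of the URL.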

Network leakage

Secondly, this command line sends the user credentials to an HTTP server over a clear-text protocol, open for man-in-the-middle attackers or other snoopers to spy on the connection and see what is sent. In this command line example, curl uses HTTP Basic authentication, which is completely insecure.

There are several ways to avoid this, and the key is, of course, to avoid protocols or authentication schemes that send credentials in the plain over the network. Easiest is perhaps to make sure you use encrypted versions of protocols: HTTPS instead of HTTP, FTPS instead of FTP, and so on.

If you need to stick to a plain-text and insecure protocol, see if you can switch to an authentication method that avoids sending the credentials in the clear. For HTTP, such methods include Digest (--digest), Negotiate (--negotiate) and NTLM (--ntlm).

The progress meter

curl has a built-in progress meter. When curl is invoked to transfer data (uploading or downloading), it can show that meter in the terminal to indicate how the transfer is progressing: the current transfer speed, how long the transfer has been going on, and how long it thinks remains until completion.

The progress meter is inhibited if curl deems that there is output going to the terminal, as the progress meter would then interfere with that output and mess up what gets displayed. A user can also forcibly switch off the progress meter with the -s / --silent option, which tells curl to hush.

If you invoke curl and don't get the progress meter, make sure your output is directed somewhere other than the terminal.

curl also features an alternative and simpler progress meter that you enable with -# / --progress-bar. As the long name implies, it instead shows the transfer as a progress bar.

At times when curl is asked to transfer data, it cannot figure out the total size of the requested operation; the progress meter then contains fewer details and cannot, for example, forecast transfer times.

Units

The progress meter displays bytes and bytes per second.

It will also use suffixes for larger amounts of bytes, using the 1024 base system so 1024 is one kilobyte (1K), 2048 is 2K, etc. curl supports these:

Suffix  Amount  Name

K       2^10    kilobyte
M       2^20    megabyte
G       2^30    gigabyte
T       2^40    terabyte
P       2^50    petabyte

The times are displayed using H:MM:SS for hours, minutes and seconds.

Progress meter legend

The progress meter exists to show a user that something actually is happening. The different fields in the output have the following meaning:

 % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
 0  151M    0 38608    0     0   9406      0  4:41:43  0:00:04  4:41:39  9287

From left to right:

Title                 Meaning

%                     Percentage completed of the whole transfer
Total                 Total size of the whole expected transfer (if known)
%                     Percentage completed of the download
Received              Currently downloaded number of bytes
%                     Percentage completed of the upload
Xferd                 Currently uploaded number of bytes
Average Speed Dload   Average transfer speed of the entire download so far, in number of bytes per second
Average Speed Upload  Average transfer speed of the entire upload so far, in number of bytes per second
Time Total            Expected time to complete the operation, in HH:MM:SS notation for hours, minutes and seconds
Time Spent            Time passed since the start of the transfer, in HH:MM:SS notation for hours, minutes and seconds
Time Left             Expected time left to completion, in HH:MM:SS notation for hours, minutes and seconds
Curr.Speed            Average transfer speed over the last 5 seconds (the first 5 seconds of a transfer is based on less time, of course), in number of bytes per second