Monday, July 21, 2008

The FTP Turn Off

Because I am an avid fan of people who are better technical writers than I am, here is an article from a gentleman who has written about something near and dear to my heart.

The insecurity of FTP.

This is going to sound a little weird at first considering what I do for a living, but I want you to stop using FTP.

There are too many aspects of it which have not kept up with modern computing environments. In particular:

Unless tunneled over a secure socket, FTP is 100% insecure. Your password, and the contents of all of your files are sent in the clear, free to be examined or captured by any network hop between you and your server.
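To make "in the clear" concrete, here is a minimal Python sketch of the bytes a plain FTP client writes to the control connection when logging in. The function name and the sample credentials are made up for illustration. A real client waits for the server's reply between the two commands, but the bytes on the wire are the same.

```python
def ftp_login_bytes(username: str, password: str) -> bytes:
    """Return the bytes a plain FTP client sends on the control connection at login."""
    # RFC 959 commands are text lines terminated by CRLF. Nothing is hashed
    # or encrypted before it is written to the socket.
    commands = [f"USER {username}", f"PASS {password}"]
    return "".join(line + "\r\n" for line in commands).encode("ascii")


# Hypothetical credentials, for illustration only.
wire = ftp_login_bytes("alice", "hunter2")
print(wire)               # b'USER alice\r\nPASS hunter2\r\n'
print(b"hunter2" in wire)  # True: the password is readable by anyone on the path
```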

The spec defines no way of setting the modification dates/times of files. A number of non-standard extensions have arisen to deal with this shortcoming. Some servers support one but not the others. Some support none. Some claim to support one method but misinterpret the arguments, treating the timestamps as local time rather than UTC. I've seen FTP servers simply drop the connection whenever asked to set a timestamp on a file. For such a simple and necessary operation, it's chaos.
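As a rough illustration of the UTC problem, here is a Python sketch that handles timestamps in the YYYYMMDDHHMMSS form used by extensions such as MDTM and MFMT. The function names are hypothetical, and the sketch assumes the server really does mean UTC, which is exactly the thing you cannot count on.

```python
from datetime import datetime, timedelta, timezone


def parse_ftp_timestamp(value: str) -> datetime:
    """Parse a YYYYMMDDHHMMSS timestamp and interpret it as UTC."""
    # Some servers append fractional seconds (".sss"); drop them here.
    digits = value.split(".", 1)[0]
    naive = datetime.strptime(digits, "%Y%m%d%H%M%S")
    return naive.replace(tzinfo=timezone.utc)


def format_ftp_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as YYYYMMDDHHMMSS in UTC."""
    # Converting to UTC first is the step a misbehaving server or client skips.
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


# A file modified at 17:30 in a UTC+2 zone must be sent as 15:30 UTC.
local = datetime(2008, 7, 21, 17, 30, 0, tzinfo=timezone(timedelta(hours=2)))
print(format_ftp_timestamp(local))  # 20080721153000
```

A server that treats the same digits as local time would store a timestamp that is off by its own UTC offset, and nothing in the exchange reveals the mistake.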

The spec defines no reliable way of determining the string encoding used for file names. We are able to get it right some of the time using educated guesses, but it's hardly reliable. The Internet has made the world smaller than ever, and the world simply needs to use protocols that support international character sets.
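An "educated guess" of this kind might look like the following Python sketch. It is an assumption about how a client could do it, not a description of any particular client: try UTF-8 first, then fall back to Latin-1, which accepts any byte sequence and so never fails.

```python
from typing import Tuple


def decode_filename(raw: bytes) -> Tuple[str, str]:
    """Guess the encoding of a file name; return (decoded name, encoding guessed)."""
    try:
        # Valid UTF-8 is unlikely to occur by accident, so trust it when it decodes.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail,
        # but it can easily be the wrong answer.
        return raw.decode("latin-1"), "latin-1"


print(decode_filename("café".encode("utf-8")))    # ('café', 'utf-8')
print(decode_filename("café".encode("latin-1")))  # ('café', 'latin-1')

# A name written in Windows-1252 decodes without error but comes out wrong.
name, guessed = decode_filename("€uro.txt".encode("cp1252"))
print(name == "€uro.txt")  # False
```

The last case is the whole problem in miniature: the fallback succeeds silently, and the client has no way of knowing the name is garbled.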

FTP was designed to be used interactively by a human sitting at a terminal, not by a GUI application working on the human's behalf. The spec doesn't even define the output format that should be used for directory listings. Half of the work in writing a decent FTP client is being able to interpret the hundreds of different types of directory listings you might receive without much hint as to which type the server is sending. This leads to all kinds of subtle glitches. For example, for files more than a year old appearing in Unix-style directory listings, it's impossible to determine their exact modification time without using additional (and likely unsupported) non-standard commands. Sending such per-file commands kills performance. It's also impossible to reliably handle unusual cases such as leading/trailing spaces in file names without hints from the user about the type of server on the other end.
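For a sense of what this parsing involves, here is a Python sketch that handles just one of those formats, the common Unix "ls -l" style. The regular expression and the names are assumptions made for illustration; real servers deviate from this layout in many ways. Note how each line carries either a year or a time of day, never both.

```python
import re
from typing import NamedTuple, Optional

# One common Unix-style layout: mode, links, owner, group, size, date, name.
LISTING_RE = re.compile(
    r"^(?P<mode>[\-dlbcps][rwxsStT\-]{9})\s+"
    r"(?P<links>\d+)\s+(?P<owner>\S+)\s+(?P<group>\S+)\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<year_or_time>\d{4}|\d{1,2}:\d{2})\s"
    r"(?P<name>.+)$"
)


class ListingEntry(NamedTuple):
    name: str
    size: int
    month: str
    day: int
    year: Optional[int]  # None when the line shows a time of day instead
    time: Optional[str]  # None when the line shows a year instead


def parse_unix_listing_line(line: str) -> Optional[ListingEntry]:
    """Parse one Unix-style listing line, or return None if it does not match."""
    match = LISTING_RE.match(line)
    if match is None:
        return None
    stamp = match["year_or_time"]
    has_time = ":" in stamp
    return ListingEntry(
        name=match["name"],  # assumes exactly one space separates date and name
        size=int(match["size"]),
        month=match["month"],
        day=int(match["day"]),
        year=None if has_time else int(stamp),
        time=stamp if has_time else None,
    )


old = parse_unix_listing_line("-rw-r--r--   1 steve staff      1234 Jul 21  2007 old report.txt")
new = parse_unix_listing_line("-rw-r--r--   1 steve staff       512 Jul 21 12:30 notes.txt")
print(old.year, old.time)  # 2007 None
print(new.year, new.time)  # None 12:30
```

For the older file the time of day is simply absent, and for the recent one the year has to be inferred. The sketch also has to guess where the file name starts, which is why leading spaces in names cannot be handled reliably.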

The spec defines no way of dealing with file metadata, such as Unix permissions, owners, and groups. Again, various servers have implemented extensions for working with this, but you cannot rely on their presence or interoperability.

FTP requires a minimum of two socket connections to transfer a file: the control connection, which is established first, and then data connections which are created and destroyed every time you transfer a single file, or request a directory listing. This is deadly to your overall throughput, especially on a high-latency Internet connection. And worse, it leads to the next problem:

FTP is not friendly with firewalls. Because it constantly needs to establish new connections, this has led us to "passive mode" which might as well be black magic as far as most people are concerned. Briefly, passive mode means the client initiates data connections to the server, rather than the default where the server makes connections to the client (yes, really). Worse still, data connections occur on varying high port numbers (usually 49152-65535) which means sysadmins would have to open over 16,000 ports in the firewall, almost defeating the purpose of a firewall in the first place. It's a mess, and it's really hard to understand. Firewalls are a necessary evil for today's Internet, and our transfer protocols should be able to deal with them.
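To take some of the black magic out of passive mode, here is a Python sketch of how a client reads the server's reply to the PASV command. The reply carries six numbers: four for the IP address and two that encode the port as high byte and low byte. The function name and the sample address are made up, and the sketch assumes the numbers are wrapped in parentheses, which is common but not something every server does.

```python
import re
from typing import Tuple

# Six comma-separated numbers: h1,h2,h3,h4 (address) and p1,p2 (port bytes).
PASV_RE = re.compile(r"\((\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})\)")


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (host, port) for the data connection from a 227 reply."""
    match = PASV_RE.search(reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a usable PASV reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    if any(number > 255 for number in numbers):
        raise ValueError(f"byte value out of range in: {reply!r}")
    host = ".".join(str(number) for number in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte, then low byte
    return host, port


# Hypothetical reply from a server on a private network.
print(parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,195,80)."))
# ('192.168.1.10', 50000)

# Size of the usual dynamic port range, 49152 through 65535 inclusive.
print(65535 - 49152 + 1)  # 16384
```

The client then opens a second connection to that host and port, and the firewall has to allow it. A new port is negotiated this way for every transfer and every directory listing.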

So, if not FTP, what should you use instead? Of what's available today, I'd recommend everyone switch to SFTP if you possibly can.

It's secure, it's consistently implemented, and it's machine-readable. That all adds up to a more reliable, future-proof transfer client for you.

I've talked to a lot of people who didn't even realize their host supported SFTP. If your hosting service supports SFTP, you usually don't have to change anything except for switching your client protocol from FTP to SFTP. If it doesn't work, you should ask your host if there's anything else you have to do (such as use a different port number).
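In terms of connection settings, the switch often amounts to no more than the following Python sketch, which rewrites an ftp:// address into an sftp:// one. The function name, the host example.com, and port 2222 are all hypothetical. The sketch deliberately drops any port and password from the original address, since an FTP port number does not carry over to SFTP.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def to_sftp_url(ftp_url: str, port: Optional[int] = None) -> str:
    """Rewrite an ftp:// URL as sftp://, keeping the user, host, and path."""
    parts = urlsplit(ftp_url)
    if parts.scheme != "ftp":
        raise ValueError(f"expected an ftp:// URL, got: {ftp_url!r}")
    netloc = parts.hostname or ""
    if parts.username:
        netloc = f"{parts.username}@{netloc}"
    # Only add a port when the host asks for a non-default one.
    if port is not None:
        netloc = f"{netloc}:{port}"
    return urlunsplit(("sftp", netloc, parts.path, "", ""))


print(to_sftp_url("ftp://steve@example.com/public_html/"))
# sftp://steve@example.com/public_html/
print(to_sftp_url("ftp://steve@example.com:21/public_html/", port=2222))
# sftp://steve@example.com:2222/public_html/
```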

If your host doesn't support SFTP, you should find a different host. It's not hard to support, and it's ridiculous to force people into using insecure protocols in the year 2008. Ask them, for example, why they don't support telnet. FTP is no better.

FTP has served us well, but it's time to move on. You wouldn't use a 23-year-old computer to do your work, so don't use a protocol of the same vintage. Demand modern transfer protocols from your host.

Update
Several people have taken issue with me calling out the age of the protocol. After all, Ethernet, IP, Unix, HTML, and so on are also quite old, but seem to be holding up OK.

I guess it was a silly point to bring up. I hope it's at least obvious from the article that I'm not suggesting that FTP's age is its primary problem, but rather the issues in the bulleted list.

The difference between FTP and other old-but-still-useful tech is that the others have been updated periodically to keep pace with the rapid evolution of the industry.

Ethernet now has CAT6. IP is (sort of... slowly...) mutating into IPv6. Unix has had so many mutations it would be hard to name them all. HTML is coming up on version 5.

FTP is just FTP, pretty much the same as it was when Jon Postel & co. wrote it. We've wrapped it in secure tunnels and thrown countless proprietary extensions at it (that nobody agrees on how to implement). But it's my opinion (and certainly not everybody's) that it's broken at a fundamental level for its intended purpose for today's Internet.

So, yes, the age of the protocol BY ITSELF is a non-argument. It's that it has languished for that long without any cleanup from any standards organization or committee. SFTP seems the best candidate to replace it since it is widely deployed, solves pretty much all the problems I mentioned, and in most cases is an easy substitution for end users to make. Of the realistic solutions to the problem (not "let's write a new protocol!") it's the most accessible.

Folks, I'm a Newton user. You don't have to tell me that age does not necessarily equal irrelevance.

(Note: Technically speaking, I even understated FTP's age. I was going by RFC 959, which is the specification still in use today. However, a reader reminds me that the core FTP functionality dates back to RFC 354, drafted in 1972, and was designed for the trusted environment of ARPANET. It predates both TCP/IP and the Internet as we know it today.)

Update 2
There is some confusion over what I mean by "SFTP". I'm referring to the SSH File Transfer Protocol, not FTP-over-SSL, which is informally known as FTPS. FTPS addresses FTP's lack of encryption, but is otherwise exactly the same protocol as FTP, with exactly the same problems.



Originally at: http://stevenf.com/archive/dont-use-ftp.php