TCP/IP Update
mbbrutman
November 10th, 2006, 06:00 AM
I'm still working on my TCP/IP for small PC clones. It's been a while since I posted an update, so here is what it looks like.
The code supports the following:
Connecting to another machine ("active" connect)
"Listen" support for accepting incoming connections ("passive" connect)
Multiple open socket connections
MTU size setting for different network types/topologies
User defined receive buffer .. used for getting max performance
Correct TCP 'window size' advertising; helps with flow control
A good tracing and logging facility
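As a rough illustration of the window-size advertising item above: the advertised window is normally the free space left in the receive buffer, clamped to the 16-bit window field of the TCP header. This is only a sketch. The `RecvBuffer` structure and the `advertised_window` function are hypothetical names, not taken from the actual library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a user-supplied receive buffer. */
typedef struct {
    uint32_t capacity; /* total size of the receive buffer in bytes */
    uint32_t used;     /* bytes received but not yet read by the application */
} RecvBuffer;

/* Window to advertise: free buffer space, clamped to the 16-bit
   window field of the TCP header. */
uint16_t advertised_window(const RecvBuffer *rb)
{
    uint32_t free_bytes;

    assert(rb->used <= rb->capacity);
    free_bytes = rb->capacity - rb->used;
    return (uint16_t)(free_bytes > 0xFFFFul ? 0xFFFFul : free_bytes);
}
```

As the application drains the buffer, the advertised window grows again, which is what lets the sender speed up or back off.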
Still to do:
Better TCP 'reset connection' support
I need to revamp the UDP support - it's been ignored for months
DNS requests over UDP
Performance work - I need to put in a high performance memcpy routine.
The code runs on any IBM clone with a packet driver. It has been tested with a generic 386-40 with an NE2000 card, an IBM PCjr with a WD8003 card, an IBM XT with a 3Com 3C503 and an IBM XT with an Intel EtherExpress 8/16.
Even though I'm adding features and required support, performance is still very good. The last round of changes shows the 386-40 receiving a file from another machine at 451KB/sec, which includes the overhead of buffering and writing to disk. The same code on an XT with the original 10MB hard drive runs at about 25KB/sec, and the PCjr is in the same range. After the faster memcpy code goes in it will be far faster.
Right now there are two apps:
NetCat: a version of the Linux utility. Lets you send and receive data that you pipe in/redirect from stdin and stdout. I've been using this for my file transfer tests.
A simple 'echo server' which just sends back whatever you send it. This handles up to 5 incoming connections simultaneously and it is used to test the listen support.
Both apps are under 50KB in size, and the total memory requirement is well under 128K including the packet driver.
I've been working on this for almost a year now. A year ago I couldn't get the packet driver to send a byte on the Ethernet. This has been the most difficult home programming project that I have ever attempted.
The ultimate goal is a telnet BBS running on some vintage hardware (PCjr?) that will use this. The code is stable enough where I could start thinking about BBS features, but I really want to polish the TCP/IP code first to make sure that it doesn't cause me hard-to-debug problems later.
Mike Chambers
November 10th, 2006, 12:03 PM
that's awesome! i'm glad to hear it's coming along. i'd love to use it for my programs once you get DNS resolution working. i can just communicate with it via interrupt calls like wattcp and tcpdrv, right?
if you give me the interface specs i would love to write quickbasic code to access it.
on a similar note, do you have documentation for the DNS resolution protocol? i can't find any decent information on it. i've been trying to do so for a long time since there isn't anything built into either wattcp or tcpdrv, afaik. i know there isnt for sure in tcpdrv.
EDIT: i meant to ask what kind of hard drive you're using in the 386/40 when you test file transfer and include disk write overhead. it makes all the difference in the world. with speeds like that, i assume it's an IDE right? is it a newer one or an older one?
mbbrutman
November 10th, 2006, 12:29 PM
It's a C library that you link to. So your QuickBASIC code is not going to be able to use it in its current form. I can give it a software interrupt interface like WatTCP and Trumpet, but that won't be soon. That would also require some significant changes. (See below)
To use it you are going to have to work in a language that will let you link against the library. Turbo C++ 3.0 is guaranteed good ... There may be others, like a Borland PASCAL of the same vintage that will be able to link the library in and use it.
The problem with a TSR interface like WatTCP and Trumpet is that they can't use heap storage - everything is statically allocated. Performance in a lot of cases is a function of buffer space, and if you are confined to small buffer spaces you potentially lose a lot of performance. The NCSA Telnet code (which has an excellent FTP server) uses the same approach that I do. WatTCP and Trumpet could have let the user allocate the space for them, but that complicates things.
My DNS reference is in text books. I'm using Douglas Comer's "Internetworking with TCP/IP", which is good but in some parts dated. (It is a 15 year old book already.) For the most up to date reference I am using Charles Kozierok's "TCP/IP Guide", which is very new, 1500 pages, and $80. :-)
I thought that TCPDRV and NTCPDRV could resolve addresses - you have to tell it the DNS server address.
The hard disk on the 386-40 is a 1.2GB Western Digital on a Promise 16 bit IDE controller. Nothing fancy. Beats the living daylights out of the XT, but it is nothing special for a 386. If I skip the hard disk writes I was getting over 600KB/sec, but that number will be lower now because of the additional memcpy overhead that I've added. (It was unavoidable if I wanted the code to be usable by humans.)
mbbrutman
November 11th, 2006, 09:34 AM
Btw, I'm looking for other C programmers. If you want to pick up a good language that is still in widespread use and is the core of most current operating systems, C is the way to go.
C or C++ on an old DOS machine is a little more challenging than C using a modern Linux distribution. Pointers are only 16 bits, and you have wrapping problems because of the segmentation. Most of the common library calls are there, so they are not a problem. For some of the stranger stuff you wind up using DOS or BIOS interrupts, which the compiler makes pretty easy to do.
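To make the wrapping problem concrete, here is a small sketch of real-mode address arithmetic written in portable C. The function names are made up for illustration. It assumes the 8088 behaviour where the physical address is truncated to 20 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset,
   truncated to 20 bits as on an 8088. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFul;
}

/* Advancing a 16-bit offset wraps around at 64KB; it does not carry
   into the segment, so the pointer lands back at the start of the
   same segment. */
uint16_t advance_offset(uint16_t offset, uint16_t count)
{
    return (uint16_t)(offset + count);
}

/* Split a 20-bit linear address into a normalized segment:offset
   pair whose offset is always below 16. */
void normalize(uint32_t linear, uint16_t *segment, uint16_t *offset)
{
    assert(linear <= 0xFFFFFul);
    *segment = (uint16_t)(linear >> 4);
    *offset  = (uint16_t)(linear & 0xFu);
}
```

Many different segment:offset pairs name the same byte, which is why comparing or incrementing pointers across a 64KB boundary needs care.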
My development environment is Turbo C++ 3.0 on a 386-40. The Watcom toolchain is supposed to be pretty good too. The compilers generally require a 386 because of their memory usage, but they generate code that runs on the 8088.
I'm willing to talk with Pascal bigots as well. It's close enough .. :-)
You are doing great stuff with QB, but based on what I've seen in your code you are being inhibited by the language and you may not realize it. The way that you use QB would be very natural in C.
Terry Yager
November 11th, 2006, 09:56 AM
Mike C,
If it's just a matter of not having the program, I've got a copy of Turbo C that I'll never use. It's buried in storage right now, but if ya want to try it, I'll dig it out for ya, for postage cost. I think it's version 3.0, but I ain't positive.
--T
chuckcmagee
November 11th, 2006, 02:08 PM
That's funny --- "the pascal bigots" :p
mbbrutman
November 11th, 2006, 02:25 PM
Yes, we all have our biases. I wasn't trying to hide mine very well. :-)
To me Pascal is like programming in a strait-jacket. Borland's extensions for the PC are tolerable because they allow you to get at things that straight Pascal doesn't allow for.
On the other hand, C is like playing with razor blades. Great for expert barbers, not recommended for small children. I've been programming in C for 16 years, 14 years as a professional. After getting used to reasonable C compilers on OS/2 and Linux, going back to the DOS environment with its 16 bit pointers and 16 bit integers was a hard re-adjustment. And trying to interface to the packet driver (like I did for this project) required dropping into assembler, which was even ickier.
If you program under Windows or Linux you generally don't have to use assembler or worry about interrupt handling. Going back to DOS programming and doing this level of code has been a brain bender at times.
(I've been doing DOS apps for years. Just nothing that required such a high level of technical expertise. It can be much worse.)
Terry Yager
November 11th, 2006, 03:53 PM
Pascal is still considered less harmful than BASIC (especially when programming in a CP/M environment).
--T
Mike Chambers
November 11th, 2006, 07:14 PM
10 GOTO 10
Mike Chambers
November 11th, 2006, 07:15 PM
thanks for the offer terry! i'll see if i can find a copy some other way (wink wink) but if i can't i will PM you. :)
Mike Chambers
November 11th, 2006, 07:18 PM
thanks. yeah i know the language is a bit lacking. it's mostly the lack of speed that'll drive you batty if you're using an older computer. there are ways to get around some of its other quirks... like reading var pointer locations to use strings as a sort of "byte array", etc...
but yeah. i think i'm going to get my hands on turbo C++ and start working with it. hopefully i can pick it up fairly quickly.
btw, i can't wait for your DOS telnet BBS server! i'm going to be running that :)
will it support multi-threading and door games?
Terry Yager
November 11th, 2006, 07:48 PM
I think ya can d/l it somewhere, but I'm not sure about documentation.
--T
mbbrutman
November 12th, 2006, 05:40 AM
Let's take the language discussion offline .. I kind of wanted to keep this thread for periodic status updates for the code. I've offered Mike a copy of Turbo C++ before, so if he can't get it easily there is a copy waiting for him.
Will it support multi-threading and door games?
Now there is an interesting question. What exactly does multi-threading mean on a DOS machine? It's not like DOS supports threads. ;-)
But to answer the question, the BBS will allow for multiple users to be logged in at the same time, so yes, it is multi-threaded. To do something like this you have a data structure for each active connection/user that tells you what the state of that connection is. At a low level the TCP/IP socket data structure maintains the state of the connection, while a different data structure will maintain the state of the user. ie: What menu are they in, what message might they be reading, how long have they been idle, etc.
The main loop will have to poll the connections and see which ones need servicing. When a user hits a key on their telnet client far far away, that keystroke gets sent to the machine and buffered. If the main loop detects a non-empty buffer that means the user is doing something. Depending on what is going on, you might just have to copy the keystrokes to a string or you might have to act on the keystroke and do something important (like finally post the msg they've been composing for 20 minutes.)
If the TCP/IP code is good it will be able to keep receiving and buffering keystrokes even while the 4.77MHz PCjr is busy doing something else, like accessing the disk to retrieve a msg.
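The polling approach described above could look roughly like this. It is a minimal sketch, not the BBS code: the `Session` structure, its fields and both functions are hypothetical, and the keystroke buffer stands in for whatever the TCP/IP layer fills in the background.

```c
#include <assert.h>
#include <stddef.h>

#define KEYBUF_SIZE 64
#define LINE_SIZE   128

typedef enum { SLOT_FREE, AT_MENU, COMPOSING } UserState;

/* Per-user state, kept separately from the socket's own state. */
typedef struct {
    UserState state;
    unsigned char keybuf[KEYBUF_SIZE]; /* keystrokes buffered by the network layer */
    size_t keycount;                   /* number of keystrokes waiting */
    char line[LINE_SIZE];              /* text the user is typing */
    size_t linelen;
    int posted;                        /* messages posted so far */
} Session;

/* Drain the buffered keystrokes for one user. Ordinary keys are
   copied to the line; Enter acts on it. Returns keys processed. */
size_t service_session(Session *s)
{
    size_t i, n;

    assert(s->keycount <= KEYBUF_SIZE);
    n = s->keycount;
    for (i = 0; i < n; i++) {
        unsigned char c = s->keybuf[i];
        if (c == '\r') {
            if (s->state == COMPOSING)
                s->posted++;           /* "post the message" */
            s->linelen = 0;
        } else if (s->linelen < LINE_SIZE - 1) {
            s->line[s->linelen++] = (char)c;
        }
    }
    s->line[s->linelen] = '\0';
    s->keycount = 0;
    return n;
}

/* One pass of the main loop: service only the sessions that have
   input waiting. Returns how many sessions were serviced. */
int poll_sessions(Session *sessions, int count)
{
    int i, serviced = 0;

    for (i = 0; i < count; i++) {
        if (sessions[i].state != SLOT_FREE && sessions[i].keycount > 0) {
            service_session(&sessions[i]);
            serviced++;
        }
    }
    return serviced;
}
```

Nothing here blocks, so one slow user never stalls the others; the cost is that every pass touches every slot, which is cheap for a handful of connections.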
Door games? Given infinite time, sure. ;-)
The bigger question is what should a BBS provide. I'm thinking just a basic message board. File downloading doesn't make a lot of sense because nobody is going to have a client that supports that. (How many telnet clients do you know of that support a file transfer inline? None) Old terminal emulators let you do this, but no modern telnet client lets you. File downloads are probably better done using something standard like HTTP or FTP.
I wrote BBS software 20 years ago that ran for about 2 years in NYC. Back then of course it was single user and dialup. The icky part about writing BBS software is the message editing software. Unless you make ANSI codes a requirement, full screen editing is not possible. Most people won't deal with line editing, and even with ANSI codes full screen editing is a pain to do.
I don't want to make ANSI codes required .. that would hurt the feelings of our C64 friends. :-)
carlsson
November 12th, 2006, 10:17 AM
A 2K custom character set should be able to emulate ANSI graphics quite well on the C64. I'm quite sure it has been done multiple times over the years as well.
mbbrutman
November 12th, 2006, 10:39 AM
I suspect that there are not too many C64s running TCP/IP, either over SLIP or Ethernet. There are a few though: http://www.dunkels.com/adam/tfe/
I think that straight ASCII (those chars that map to PETSCII) with 40 columns is best for our Commie friends.
mbbrutman
November 23rd, 2006, 07:01 AM
Latest performance numbers. These are sends and receives of a large file, using real disk I/O. ie: Instead of just throwing the packets away to measure the speed of my code, I'm measuring actual file transfer speed:
386DX-40 with IDE hard disk, NE2000 card, 3.1MB file
Send: 417KB/sec (14KB disk read buffer)
Receive: 472KB/sec (14KB disk write buffer)
IBM PC XT with original 10MB hard disk, 3COM 3C503, 1.3MB file
Send: 26.5KB/sec (14KB disk read buffer)
Receive: 24.7KB/sec (16KB disk write buffer)
Performance is fairly good with read and write buffers as small as 4KB on the XT, and 10KB on the 386-40.
Just for comparison, I used NCSA Telnet on the XT and measured its FTP performance. NCSA can receive at 16KB/sec, and send at 21KB/sec. The NCSA code is fairly good, so I'm pretty happy to have beaten it by such a wide margin.
Next up .. fix my TCP/IP reset handling code, add some code to cleanup half completed connections, and then I'll make this code (netcat) available for download. After that, it's BBS time.
Mike Chambers
November 24th, 2006, 11:49 AM
yeah, years ago when i was about 13 or 14 i wrote a multinode BBS proggie in qbasic. it actually WORKED, too :)
i received about 3 to 5 calls a day on average. didn't support doors, though. i eventually had to stop adding features after qbasic decided the .BAS was too big to load. (i think 32 KB is the limit?)
i wish i knew about quickbasic 4.5 back then :(
and btw, hyperterminal supports inline file xfer :)
grant
November 24th, 2006, 01:50 PM
How good are you guys with programming TCP/IP on windows?
In the next few months I will be trying to create a WinAMP plugin that instead of passing the MP3 file to be played will transmit it over a TCP or UDP/IP socket to the Altair. The Altair will have a STA013 to decode the MP3. All we need is 16k/s for a 128kbit MP3.
Optionally it would be nice to transmit auxiliary data to the Altair like the ID3 tag, as well as ask the Altair the sense switch positions to control the WinAMP program.
This would be cool. :)
mbbrutman
November 25th, 2006, 06:06 AM
I don't do Windows. :-)
And that is actually more of a WinAMP question, isn't it? Windows programming is bad enough, but then you need to interface to WinAMP.
How are you going to connect the Altair and what software are you going to run on it? UDP is pretty easy on a small machine, but a real TCP/IP socket is going to be very tight and somewhat slow. (TCP likes 32 bit ints.)
grant
November 25th, 2006, 05:59 PM
Probably UDP/IP with a custom sequence and acknowledge packet structure. Since it's intended to be used on a local network and switches are so cheap the Altair shouldn't have any trouble. MAC addresses can be filtered without downloading the packet from the ethernet chip. Just peek at the packet and throw it away if we don't need it. Then it's just an exercise of transferring bytes from the ethernet buffer to the MP3 decoder buffer.
mbbrutman
November 25th, 2006, 06:38 PM
Most recent Ethernet hardware only gives the packet to the software if the MAC address matches or if it is a broadcast packet. Putting it into promiscuous mode is the exception.
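For reference, the filtering rule described above is simple to write out in software, which is roughly what code would have to do if the card were in promiscuous mode. This is a sketch with a made-up function name. It relies only on the fact that the destination MAC is the first six bytes of an Ethernet frame, and it ignores multicast.

```c
#include <assert.h>
#include <string.h>

#define MAC_LEN 6

/* Returns 1 if the frame is addressed to this station or to the
   broadcast address, 0 otherwise. */
int accept_frame(const unsigned char *frame, size_t frame_len,
                 const unsigned char *own_mac)
{
    static const unsigned char broadcast[MAC_LEN] =
        { 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF };

    assert(frame_len >= MAC_LEN);
    return memcmp(frame, own_mac, MAC_LEN) == 0 ||
           memcmp(frame, broadcast, MAC_LEN) == 0;
}
```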
My first attempts at apps programming with my TCP stuff were based on UDP with my own sequence numbers (packet numbers) and acks. Kind of like doing Xmodem over Ethernet. It worked, but it takes a bit of extra effort to get the benefit of a sliding window. For a small machine I wouldn't even attempt it - ACKing after each packet is fine by me.
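The "ACK after each packet" scheme can be sketched on the receiving side like this. The packet layout, the `Receiver` structure and `receive_packet` are all invented for illustration and are not the format either poster used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 512

typedef struct {
    uint16_t seq;                    /* packet number chosen by the sender */
    uint16_t len;                    /* payload bytes in use */
    unsigned char data[MAX_PAYLOAD];
} Packet;

typedef struct {
    uint16_t expected_seq;           /* next packet number we will accept */
    unsigned char *out;              /* where accepted payload is appended */
    size_t out_len;
    size_t out_cap;
} Receiver;

/* Handle one incoming packet and return the packet number to ACK.
   An in-order packet is stored. A duplicate (the sender retrying
   because our ACK was lost) is not stored again, but it still gets
   the same ACK so the sender can move on. */
uint16_t receive_packet(Receiver *r, const Packet *p)
{
    assert(p->len <= MAX_PAYLOAD);
    if (p->seq == r->expected_seq && r->out_len + p->len <= r->out_cap) {
        memcpy(r->out + r->out_len, p->data, p->len);
        r->out_len += p->len;
        r->expected_seq++;
    }
    /* Always ACK the last packet accepted in order. */
    return (uint16_t)(r->expected_seq - 1);
}
```

The sender keeps exactly one packet outstanding and resends it until the matching ACK arrives. That wastes a round trip per packet, which is the price of skipping the sliding window.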
How are you going to put the machine on Ethernet? Do you have a card for the bus on the Altair, or are you planning to design one?
grant
November 26th, 2006, 01:57 AM
A few years ago I connected the CS8900 to an AVR and played around with that.
I can handle writing simple kernel device drivers, TCP/IP programming for unix and windows console, but the GUI stuff just makes me lose interest.
I'm using the PacketWhacker from edtp.com.
Maybe I was wasting my time checking the MAC address on every packet. The AVR is nearly 20 times faster than the 8080 though! I had time to waste! ;)
mbbrutman
November 26th, 2006, 06:27 AM
Wow - I liked edtp.com. I'm going to have to browse that more.
I'm a software weenie, but I understand quite a bit of hardware. I've been looking to 'graft' Ethernet onto my non-standard PCjrs, and had been looking at something similar from embeddedethernet.com. The idea is the same - a basic 10Mb/sec Ethernet chipset with minimal I/O interfacing requirements.
Things get interesting when you move from application space down into TCP/IP stack. You can do a bare-bones TCP implementation that barely meets spec but will talk to other machines in less than 16K. But a good one with proper flow control, listen support, etc. takes quite a bit more.
Then there is the issue of the Ethernet hardware. I'm using packet drivers so this work is done for me, but something has to initialize the hardware and service it. That's not trivial code.
Promiscuous mode can be set in the hardware .. it's interesting, but letting the hardware do the filtering is obviously much easier.
mbbrutman
December 16th, 2006, 11:41 AM
Getting closer to a public test ..
I've added TCP reset support recently, and now I'm 'scrubbing' the code for correct behavior, and memory leaks.
For the first public test I'm going to put out a simple 'echo server' that echoes back what you send to it. It'll have a few more goodies too to make it slightly more interesting, like a command to show you how long it has been running, a command to show the number of active connections, etc.
The idea is that if it survives for a few days and gets a few hundred connections, then I probably don't have any memory leak problems. I'm going to log the TCP/IP packets too so that I can see if my code is causing problems.
Mike Chambers
December 16th, 2006, 12:27 PM
i'll be sure to connect and mess around with it for you mike.
mbbrutman
December 27th, 2006, 09:16 PM
Still lots to do, but I'm itching to test it. So it's out there for your connecting pleasure.
Telnet to 24.159.203.149, port 2301
That ip address is my Linux machine. Port 2301 is forwarded to my 386-40 running my homebrew TCP/IP, and a small server program. The server program handles up to 9 simultaneous users and does some simple tasks, like report the number of open sockets, free memory, etc.
If it runs through the night without corrupting anything I'll be pretty happy. And if it doesn't, I've got all sorts of trace logs running so that I can figure out what the problem is. :-(
On Windows machines you can use the built-in telnet client - the command looks like "telnet 24.159.203.149 2301". Same on Linux. You can connect multiple times from the same machine if you are inclined - the stats will change as you do it.
Edit: It's still running .. feel free to try it out and help me test it! I will edit this post again when the test is over.
carlsson
December 29th, 2006, 11:26 AM
Testing, testing. I'm using PuTTY + telnet from Linux simultaneously. Are there any Easter eggs for us to discover? ;-)
Once I ended a session prematurely, but the socket remained allocated for a while (as observed from the other connection). After a while, it disappeared, maybe due to a timeout.
Otherwise everything worked fine until I performed a little stress test that seems to have brought down the server... :-(
I issued the "info" command 40 times in a row. The server answered the first 12 requests, but ignored the rest. Before the stress test, "mem" reported about 448000 bytes left. Afterwards, it was only 1136 bytes and then the server seems to have died.
I hope I didn't cause any trouble, but it looks like you have a major memory leak if a small flood of commands consumes all the memory. Yes, I'm aware it is just a 386 but preferably it should be able to withstand something like that.
mbbrutman
December 29th, 2006, 11:38 AM
Carlsson - my hero!
I know that it died from corrupted memory, and I know that it happened in the last hour or so. I was just starting to dig through the logs to find it.
I'm going to recreate it here and see where I goofed up ...
Otherwise, it had been running for over 36 hours. It didn't handle a large number of connections, but I wasn't expecting that kind of testing. :-)
mbbrutman
December 29th, 2006, 11:58 AM
And the preliminary cause of death is bad reset handling ...
I see from your side that things were going fine, and then your machine reset the TCP/IP connection. My machine screwed up after that and sent the 'Hello' message to what should have been a dead socket. That's a problem ...
Time to debug - thanks again!
Mike
carlsson
December 29th, 2006, 12:00 PM
First time I got credits for crashing a server... I suppose if you build a BBS on top of it, you may get plenty of concurrent traffic even if the number of clients is limited, so good to stress test it at an early stage.
mbbrutman
December 29th, 2006, 02:36 PM
The reset processing was a red herring, but that might be a problem anyway. What OS did you connect from?
In one of your sessions the socket was established, 240 chars of data was sent, and then a FIN packet was sent to close the connection. In transit the FIN packet arrived before the data, which I handled fine. Eventually your client started resetting the connection, which was bogus behavior but I handled it just fine.
I got hosed on deallocating the memory for the receive buffer. The reset processing found a path in the code where I could deallocate the same memory twice, which is a no-no. That's what corrupted the heap and crashed everything.
I didn't recreate the bug directly, but was able to simulate it by adding an extra line of code. Sure enough, the behavior is the same ...
Time to go back and scrub the code to ensure that I don't double delete a piece of storage again.
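One common way to make a double release harmless is to funnel every teardown path through a single function that clears the pointer after freeing it. The sketch below uses plain `malloc`/`free`; the `Socket` structure and function names are hypothetical, and the real code may use a different allocator.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned char *recv_buf;  /* NULL whenever no buffer is owned */
    size_t recv_buf_size;
} Socket;

/* Allocate the receive buffer. Returns 0 on success, -1 on failure. */
int socket_alloc_buffer(Socket *s, size_t size)
{
    assert(s->recv_buf == NULL);  /* refuse to leak an existing buffer */
    s->recv_buf = (unsigned char *)malloc(size);
    if (s->recv_buf == NULL)
        return -1;
    s->recv_buf_size = size;
    return 0;
}

/* Safe to call from every teardown path (close, reset, timeout).
   A second call is a no-op because free(NULL) does nothing. */
void socket_release_buffer(Socket *s)
{
    free(s->recv_buf);
    s->recv_buf = NULL;
    s->recv_buf_size = 0;
}
```

This does not find the stray code path, but it turns heap corruption into a harmless repeat call while the scrubbing continues.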
I'll never live long enough to get a BBS on top of this. It's been a long hard year already. :-)
carlsson
December 30th, 2006, 06:17 AM
I was connecting from a Debian 3.1 system, kernel 2.4.27 (P4 Xeon) but I don't think it should make a difference?
The session I ended prematurely was initialized like this:
$ telnet 24.159.203.149 2301 < batchfile
in an attempt to pipe in a batch file of commands. Obviously it is not possible to do something like this, and the session was ended. The batch file contained exactly 240 characters of data ("info"+CRLF repeated 40 times). Yet your server still was functioning relatively fine afterwards. I should have issued a "mem" command right after the prematurely ended session.
mbbrutman
December 30th, 2006, 06:39 AM
Oh, but it does make a difference!
I was wondering how you got those commands in that fast. It also explains why the Reset was sent.
So here is the chain of events as I see it from a tcpdump log:
Initial connection
SYN/ACK packet back to your machine
ACK back from your machine (connection is now established)
FIN packet with a sequence number that is too high from your machine
My machine ignores the packet and sends an ACK to tell your machine what the expected sequence number is
Delayed packet with 240 chars of data from your machine arrives
My machine processes that packet, because it has the correct sequence numbers
My machine sends the hello message and ACKs your 240 chars of input
Your machine sends a RESET on the connection with an older sequence number (which is bogus)
My machine ignores the bogus RESET packet because it is out of window
My machine retries sending the initial 'hello' packet because it has timed out
My machine eventually times out trying to resend the packets, and kills the connection.
The extra delete of the receive buffer happens here .. and things start to go bad.
Your machine should not have sent the RESET packet because my machine did not ACK the FIN packet, which came in out of order and was not processed. Which is why I call the RESET packet bogus.
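The decisions in that chain of events can be sketched as a small classifier. This is a simplified model, not the actual stack: it assumes out-of-order segments are not queued (they only trigger an ACK carrying the expected sequence number), and every name in it is hypothetical. The window test uses modulo 2^32 arithmetic so that sequence number wraparound is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t seq;      /* sequence number of the incoming segment */
    int is_reset;      /* nonzero if the RST flag is set */
} Segment;

typedef struct {
    uint32_t rcv_nxt;  /* next sequence number we expect */
    uint16_t rcv_wnd;  /* size of our receive window */
} RecvState;

typedef enum { SEG_PROCESS, SEG_ACK_ONLY, SEG_DROP } SegAction;

/* Returns 1 if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd). The
   unsigned subtraction wraps, so this works across 2^32. */
int seq_in_window(uint32_t seq, uint32_t rcv_nxt, uint16_t rcv_wnd)
{
    return (uint32_t)(seq - rcv_nxt) < rcv_wnd;
}

SegAction classify_segment(const Segment *seg, const RecvState *st)
{
    assert(seg != NULL && st != NULL);
    if (seg->is_reset) {
        /* A reset outside the window is ignored. */
        return seq_in_window(seg->seq, st->rcv_nxt, st->rcv_wnd)
                   ? SEG_PROCESS : SEG_DROP;
    }
    if (seg->seq == st->rcv_nxt)
        return SEG_PROCESS;   /* exactly what we were waiting for */
    return SEG_ACK_ONLY;      /* remind the peer what we expect */
}
```

With `rcv_nxt` at 1000, a FIN numbered 1240 gets an ACK only, the delayed data numbered 1000 is processed, and a reset carrying an older number such as 760 is dropped.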
However, I didn't know that you were redirecting your input into telnet - that makes things different. As soon as telnet finished sending the input, it closed the connection and didn't wait for any output! So when my machine started sending output, it was sending it to a closed connection, and hence the reset packet.
Telnet was not well behaved here - it should have waited for output before hammering the connection and closing it. It didn't even wait for my side to ACK the FIN packet, or send a FIN packet of my own. Very bad behavior, and not in spec ..
Normal telnet would not have done this - this is a side effect of redirecting stdin.
Interesting overall .. now I understand why the RESET packet came in.
If I didn't have the double delete my machine would have recovered fine, even though the network sent packets out of order and your machine didn't wait for output or the final FIN from my side. A very good test though .. I'm going to add that to my testing now.
Mike