Bug 87383 - [WebSocket] Send requires super linear time against data size
Summary: [WebSocket] Send requires super linear time against data size
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: WebCore Misc. (show other bugs)
Version: 528+ (Nightly build)
Hardware: All All
: P2 Normal
Assignee: Takashi Toyoshima
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-05-24 06:05 PDT by Takashi Toyoshima
Modified: 2012-06-04 17:52 PDT (History)
5 users (show)

See Also:


Attachments
Patch (6.81 KB, patch)
2012-05-24 06:08 PDT, Takashi Toyoshima
no flags Details | Formatted Diff | Diff
Patch (8.57 KB, patch)
2012-05-25 04:09 PDT, Takashi Toyoshima
no flags Details | Formatted Diff | Diff
Patch (8.58 KB, patch)
2012-05-25 05:54 PDT, Takashi Toyoshima
no flags Details | Formatted Diff | Diff
Patch (8.58 KB, patch)
2012-05-28 00:02 PDT, Takashi Toyoshima
no flags Details | Formatted Diff | Diff
Patch (8.56 KB, patch)
2012-05-29 00:19 PDT, Takashi Toyoshima
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Takashi Toyoshima 2012-05-24 06:05:14 PDT
Chromium side issue is here.
http://code.google.com/p/chromium/issues/detail?id=123228

WebSocket send operation requires super liner time to send data against data size.
This is because SocketStreamHandleBase requires O(n^2) time against buffed data size.

My simple investigation results are here.

     |  Chrome  | Chrome with this patch
=====+==========+=======================
  1M |    300ms |    333ms
  2M |    700ms |    660ms
 10M |   4700ms |   3300ms
 20M |  13000ms |   6600ms (x2 faster)
100M | 202000ms |  33000ms (x6 faster)

After applying this patch, sending time becomes linear to data size.

I also notice that receiving and binary handling operations have other bottle necks to be fixed.
Comment 1 Takashi Toyoshima 2012-05-24 06:08:57 PDT
Created attachment 143808 [details]
Patch
Comment 2 Yuta Kitamura 2012-05-24 07:23:51 PDT
Comment on attachment 143808 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=143808&action=review

> Source/WebCore/ChangeLog:7
> +

You should explain why the current implementation requires super-linear time and how you resolve that.

> Source/WebCore/platform/network/SocketStreamHandleBase.cpp:39
> +const unsigned int vectorCoalescingSize = 4096;

Looks too small to me, but I might be wrong. I feel like 10KiB or even 100KiB is safe. Ultimately profiling is required to figure out the best value.

If my memory is correct, allocating 4096 bytes tended to be inefficient, because that requires allocating two pages instead of one as malloc needs 4096 bytes plus managing area. But the system has evolved and might have changed since then.

> Source/WebCore/platform/network/SocketStreamHandleBase.h:60
> +        class Block : public Vector<char> {

I'm not particularly a fan of deriving a class that doesn't have a virtual destructor. This sounds like a sign of poor design to me.

> Source/WebCore/platform/network/SocketStreamHandleBase.h:65
> +            Block* prev() { return m_prev; };
> +            Block* next() { return m_next; };

nit: The last colon isn't necessary

> Source/WebCore/platform/network/SocketStreamHandleBase.h:83
> +            DoublyLinkedList<Block> m_buffer;

Why do you need DoublyLinkedList and need to manage memory manually? Is there any specific reason you can't use "Deque<Vector<char> >"? In this way you don't have to manage Block's memory by hand.
Comment 3 Alexey Proskuryakov 2012-05-24 13:23:04 PDT
This is a whole lot of generic looking code for SocketStreamHandleBase. This looks something that may need to be abstracted out into WTF if we really need it.

> I'm not particularly a fan of deriving a class that doesn't have a virtual destructor. This sounds like a sign of poor design to me.

Private inheritance is sometimes OK, but public is dangerous.
Comment 4 Takashi Toyoshima 2012-05-25 04:09:47 PDT
Created attachment 144041 [details]
Patch
Comment 5 Takashi Toyoshima 2012-05-25 05:54:41 PDT
Created attachment 144050 [details]
Patch
Comment 6 Takashi Toyoshima 2012-05-25 05:59:53 PDT
4KiB is clearly small for this.
Third patch applies 128KiB for block size.

Deque size will be 128MiB/128KiB < 1000 and it looks enough small to be kept per WebSocket object.
Comment 7 Takashi Toyoshima 2012-05-28 00:02:06 PDT
Created attachment 144284 [details]
Patch
Comment 8 Takashi Toyoshima 2012-05-28 00:07:39 PDT
After some performance tests, I notice that 128kB is still small.

     |  Chrome  | w/1st CL | 128kB blk | 1MB blk
=====+==========+==========+===========+--------
  1M |    300ms |    333ms |     333ms |   330ms
  2M |    700ms |    660ms |     666ms |   660ms
 10M |   4700ms |   3300ms |    3500ms |  3300ms
 20M |  13000ms |   6600ms |    7500ms |  6600ms
100M | 202000ms |  33000ms |   60100ms | 36000ms

Kent-san, could you look my 4th patch?
Comment 9 Kent Tamura 2012-05-28 00:56:36 PDT
Comment on attachment 144284 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=144284&action=review

> Source/WebCore/platform/network/SocketStreamHandleBase.h:67
> +        StreamBuffer<char, 1024 * 1024> m_buffer;

Does this mean we allocate 1MB even if we'd like to send a few octets?
Comment 10 Takashi Toyoshima 2012-05-28 01:09:33 PDT
Comment on attachment 144284 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=144284&action=review

>> Source/WebCore/platform/network/SocketStreamHandleBase.h:67
>> +        StreamBuffer<char, 1024 * 1024> m_buffer;
> 
> Does this mean we allocate 1MB even if we'd like to send a few octets?

Yes. And following sending data might be coalesced to the previous buffer.
When the pending data is sent, the buffer is released.
I think Vector<char> could be alternative to FixedArray<char, 1024 * 1024>.
But, it might cause performance regression on small but many payloads.
Comment 11 Kent Tamura 2012-05-28 20:10:33 PDT
(In reply to comment #8)
> After some performance tests, I notice that 128kB is still small.
> 
>      |  Chrome  | w/1st CL | 128kB blk | 1MB blk
> =====+==========+==========+===========+--------
>   1M |    300ms |    333ms |     333ms |   330ms
>   2M |    700ms |    660ms |     666ms |   660ms
>  10M |   4700ms |   3300ms |    3500ms |  3300ms
>  20M |  13000ms |   6600ms |    7500ms |  6600ms
> 100M | 202000ms |  33000ms |   60100ms | 36000ms
> 
> Kent-san, could you look my 4th patch?

I feel 128KiB is enough.  Sending 100MB frame sounds not realistic.
Comment 12 Kent Tamura 2012-05-28 20:25:20 PDT
Comment on attachment 144284 [details]
Patch

View in context: https://bugs.webkit.org/attachment.cgi?id=144284&action=review

> Source/WTF/wtf/StreamBuffer.h:50
> +    }
> +    ~StreamBuffer()

nit: Please add a blank line between function definitions.

> Source/WTF/wtf/StreamBuffer.h:54
> +    }
> +    bool isEmpty() const { return !size(); }
> +    void append(const T* data, size_t size)

ditto.

> Source/WTF/wtf/StreamBuffer.h:63
> +                OwnPtr<Block> block = adoptPtr(new Block);
> +                m_buffer.append(block.release());

nit: You can omit the variable 'block'.

> Source/WTF/wtf/StreamBuffer.h:75
> +    }
> +    void consume(size_t size)

Please add a blank line.
We should add a comment that we can't specify size which is >= firstBlockSize().

> Source/WTF/wtf/StreamBuffer.h:81
> +        if (m_readOffset == BlockSize) {

We had better use >= for safety.

> Source/WTF/wtf/StreamBuffer.h:87
> +    }
> +    size_t size() const { return m_size; }
> +    const T* firstBlockData() const

Please add blank lines.

> Source/WTF/wtf/StreamBuffer.h:93
> +    }
> +    size_t firstBlockSize() const

Please add a blank line.

> Source/WebCore/platform/network/SocketStreamHandleBase.cpp:2
> - * Copyright (C) 2009, 2011 Google Inc.  All rights reserved.
> + * Copyright (C) 2012 Google Inc.  All rights reserved.

We don't replace existing years in WebKit.

> Source/WebCore/platform/network/SocketStreamHandleBase.h:3
> - * Copyright (C) 2009 Google Inc.  All rights reserved.
> + * Copyright (C) 2012 Google Inc.  All rights reserved.

ditto.
Comment 13 Takashi Toyoshima 2012-05-29 00:19:44 PDT
Created attachment 144465 [details]
Patch
Comment 14 Takashi Toyoshima 2012-05-29 00:22:27 PDT
After an offline chatting with Kent, I decide to use Vector<char> instead of FixedArray. Actually, we have no idea how does 1MB allocation affect mobile platforms like iPhone. But, 100MB sending is also important for binary communication. In small size communication, using Vector<char> is equivalent to current implementation and introduced StreamBuffer achieve x6 times performance improvement. I believe this approach keep or improve performance for all types of communication usages. It never causes regression.

Alexey,
I think you are familiar with mobile platform use cases. Any thoughts?

Kent,
I fixed points you make comments.
Comment 15 Takashi Toyoshima 2012-06-03 23:19:37 PDT
Kent-san,
There is no objection on this change for a week.
Could you review my last patch again?
Comment 16 Kent Tamura 2012-06-04 17:36:38 PDT
Comment on attachment 144465 [details]
Patch

ok
Comment 17 WebKit Review Bot 2012-06-04 17:52:14 PDT
Comment on attachment 144465 [details]
Patch

Clearing flags on attachment: 144465

Committed r119446: <http://trac.webkit.org/changeset/119446>
Comment 18 WebKit Review Bot 2012-06-04 17:52:21 PDT
All reviewed patches have been landed.  Closing bug.