Homework 0 forum

Integration test

 
Integration test
by Mladen Korunoski - Friday, 25 September 2020, 20:36
 

Dear,

While running TestBinGossip_ChainSplit, I noticed (at line 160 of the test) that the binary gossiper #7 sends a message back to our gossiper #6, so the test fails.

This should not be the case as per the homework explanation:

If it comes from another peer A, the gossiper (1) stores A’s address from the “relay peer” field in its list of known peers, (2) changes the “relay peer” field to its own address, and (3) sends the message to all known peers besides peer A, leaving the “origin peer name” field unchanged.
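For readers following along, the three quoted steps can be sketched in Go roughly as below. This is only an illustration of the quoted rule, not the homework skeleton: the `Gossiper` and `Message` types, the `Relay` method, and the extra peer at port 2008 are all hypothetical names and values chosen for the example.

```go
package main

import "fmt"

// Gossiper is a hypothetical, minimal stand-in for a homework gossiper.
type Gossiper struct {
	Addr  string   // this gossiper's own address
	Peers []string // known peer addresses, in insertion order
}

// Message carries only the two fields named in the quoted rule.
type Message struct {
	OriginPeerName string
	RelayPeerAddr  string
}

// addPeer stores addr unless it is already known.
func (g *Gossiper) addPeer(addr string) {
	for _, p := range g.Peers {
		if p == addr {
			return
		}
	}
	g.Peers = append(g.Peers, addr)
}

// Relay applies steps (1)-(3) and returns the rewritten message together
// with the list of peers it should be sent to.
func (g *Gossiper) Relay(msg Message) (Message, []string) {
	sender := msg.RelayPeerAddr
	g.addPeer(sender)          // (1) remember peer A from the relay field
	msg.RelayPeerAddr = g.Addr // (2) put our own address in the relay field
	var targets []string
	for _, p := range g.Peers { // (3) everyone except peer A
		if p != sender {
			targets = append(targets, p)
		}
	}
	return msg, targets // the origin peer name is left unchanged
}

func main() {
	// 127.0.0.1:2008 is an assumed extra peer, used only for illustration.
	g7 := &Gossiper{Addr: "127.0.0.1:2007", Peers: []string{"127.0.0.1:2008"}}
	out, targets := g7.Relay(Message{OriginPeerName: "g6", RelayPeerAddr: "127.0.0.1:2006"})
	fmt.Println(out.RelayPeerAddr, targets)
}
```

Under this reading, g7 never appears to send the message back to g6, because g6's address is exactly what gets excluded in step (3).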

Am I missing something?

Best,
Mladen

Re: Integration test
by Utku Görkem Ertürk - Friday, 25 September 2020, 20:57
 

Same problem here.

Re: Integration test
by Cristina Basescu - Friday, 25 September 2020, 22:53
 

The reference gossiper does not send the message back to peer A, so it seems that the problem is elsewhere.

Without further information, there's not much I can help with. Have you tried adding debug statements to check whether g6 receives any messages and how g6 ends up having a message in its queue of received messages?

Cristina

Re: Integration test
by Utku Görkem Ertürk - Friday, 25 September 2020, 23:24
 
For me, g6 receives a message from g7.
Debug results (just after g6 received message):
current g: g6 packetOriginName:g6 packetPeerAddress:127.0.0.1:2007

Re: Integration test
by Cristina Basescu - Friday, 25 September 2020, 23:35
 

And which address does g6 use to send the message to g7?

Re: Integration test
by Utku Görkem Ertürk - Saturday, 26 September 2020, 00:22
 

It uses 127.0.0.1:2006

Moreover, g6 sends directly to g7, and only to g7.

When I looked at known addresses that g6 sent:

1-current g: g6 packetPeerName:g6 packetPeerAddress:127.0.0.1:2007 conn.LocalAddr().String():127.0.0.1:2006

Re: Integration test
by Morten Borup Petersen - Saturday, 26 September 2020, 10:46
 
I experienced a very similar error to this, i.e., I observed the provided reference implementation g7 returning a message to g6 even though that would be against the specification.

I took a look at the messages exchanged between the gossipers in Wireshark; there we see that I was not using g6's specified port as the source port for the outgoing UDP message, but was instead relying on the OS to provide one. Now, as far as I understand, this should not affect the behavior, given that the specification states that the relay address should be taken from the relay peer address of the packet (which is correct, see f1).

(Very speculatively) this makes me think that there might be logic in the reference implementation that adds a packet's relay address to the set of known peers ((1) stores A’s address from the “relay peer” field in its list of known peers), but uses the actual UDP packet source information for the peer exclusion ((3) sends the message to all known peers besides peer A) (f2).




Re: Integration test
by Reka Inovan - Saturday, 26 September 2020, 10:55
 

Hi All,

TLDR: This error might happen because you don't reuse the listening Conn object.

Edit: It turns out Morten already said the same thing :)

I was also having the same problem. After some debugging, I discovered that the problem arises because the reference implementation actually treats the "source port" field of the UDP packet (see here) as authoritative. I think this is because this information is returned as the second return value when you call Go's UDP read function.

If the RelayPeerAddr and the "source port" differ, the reference implementation will think that the message was sent from the "source port". As a result, the reference implementation relays the message back to the RelayPeerAddr.
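To make the suspected mechanism concrete, here is a small Go sketch of it. It is a guess at the behaviour described in this thread, not the reference implementation (whose source is not shown here); `relayTargets` and the ephemeral port 54321 are hypothetical.

```go
package main

import "fmt"

// relayTargets models the suspected behaviour: the relay field is stored
// as a known peer, but the peer that gets excluded is the UDP source.
// This is an assumption about the reference gossiper, not its real code.
func relayTargets(known []string, relayField, udpSource string) []string {
	seen := false
	for _, p := range known {
		if p == relayField {
			seen = true
		}
	}
	if !seen {
		known = append(known, relayField) // step (1): uses the relay field
	}
	var targets []string
	for _, p := range known {
		if p != udpSource { // step (3): excludes the UDP source instead
			targets = append(targets, p)
		}
	}
	return targets
}

func main() {
	g6 := "127.0.0.1:2006"
	// Source port matches the relay field: nothing is sent back to g6.
	fmt.Println(relayTargets(nil, g6, g6))
	// Source port is ephemeral (54321 is made up): g6 becomes a target.
	fmt.Println(relayTargets(nil, g6, "127.0.0.1:54321"))
}
```

If this model is right, the echo to g6 disappears as soon as the UDP source address and the relay field agree.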

So, this happens when you send the packet without reusing the Conn object that you use for listening. In this case, the laddr of the new Conn object is assigned an ephemeral port, which differs from the port in your gossiper address. My implementation passes all the tests after I made this change. I know we might worry whether it is "safe", but it turns out the Conn object is designed to be used from multiple goroutines, according to the documentation.
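The difference can be observed with a short Go program. This is a minimal sketch under some assumptions: it binds to OS-chosen loopback ports (port 0) rather than the test's 2006/2007 so it does not collide with anything running, and the helper names `listenLoopback`, `sourcePort` and `demo` are made up for the example.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// listenLoopback binds a UDP socket on 127.0.0.1 with an OS-chosen port.
func listenLoopback() (*net.UDPConn, error) {
	return net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
}

// sourcePort reads one datagram from rx and returns the sender's port as
// reported by the second return value of ReadFromUDP.
func sourcePort(rx *net.UDPConn) (int, error) {
	if err := rx.SetReadDeadline(time.Now().Add(2 * time.Second)); err != nil {
		return 0, err
	}
	_, src, err := rx.ReadFromUDP(make([]byte, 64))
	if err != nil {
		return 0, err
	}
	return src.Port, nil
}

// demo returns g6's listening port, the source port g7 sees for a fresh
// DialUDP socket, and the source port g7 sees when g6 reuses its listener.
func demo() (listenPort, freshPort, reusedPort int, err error) {
	g6, err := listenLoopback()
	if err != nil {
		return
	}
	defer g6.Close()
	g7, err := listenLoopback()
	if err != nil {
		return
	}
	defer g7.Close()
	g7Addr := g7.LocalAddr().(*net.UDPAddr)
	listenPort = g6.LocalAddr().(*net.UDPAddr).Port

	// Fresh socket: laddr is nil, so the OS picks an ephemeral source port.
	fresh, err := net.DialUDP("udp", nil, g7Addr)
	if err != nil {
		return
	}
	defer fresh.Close()
	if _, err = fresh.Write([]byte("hi")); err != nil {
		return
	}
	if freshPort, err = sourcePort(g7); err != nil {
		return
	}

	// Reused listener: WriteToUDP sends from the port g6 listens on.
	if _, err = g6.WriteToUDP([]byte("hi"), g7Addr); err != nil {
		return
	}
	reusedPort, err = sourcePort(g7)
	return
}

func main() {
	l, f, r, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Printf("g6 listens on %d; fresh socket seen as %d; reused conn seen as %d\n", l, f, r)
}
```

Only the reused connection shows up at the receiver with the same port that g6 listens on, which is what a receiver relying on the UDP source address would need.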

I hope it helps.

Reka

Re: Integration test
by Utku Görkem Ertürk - Saturday, 26 September 2020, 15:23
 

Hi

Morten and Reka, thank you. You are right, that's my problem. Now I need to figure out how I can reuse the conn: when I tried DialUDP with laddr, it gave a bind error (bind: Only one usage of each socket address). There are several OS-specific packages for reusing a port (using the SO_REUSEPORT option). However, I am not sure this is the correct way to go, since writing OS-specific configuration just to reuse a port seems like overkill.
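The bind error can be reproduced in isolation with the Go sketch below. It assumes default socket options and an OS-chosen loopback port; `dialFromListeningPort` and the destination port 9 are hypothetical, and the exact error wording depends on the OS.

```go
package main

import (
	"fmt"
	"net"
)

// dialFromListeningPort binds a listener, then tries to DialUDP with the
// same local address. It returns the error from that second bind, or nil
// if the dial unexpectedly succeeds.
func dialFromListeningPort() error {
	ln, err := net.ListenUDP("udp", &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 0})
	if err != nil {
		return fmt.Errorf("setup failed: %w", err)
	}
	defer ln.Close()

	laddr := ln.LocalAddr().(*net.UDPAddr)
	// Hypothetical destination; the dial is expected to fail before it matters.
	raddr := &net.UDPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 9}

	// A second socket asks for the local address the listener already holds.
	c, err := net.DialUDP("udp", laddr, raddr)
	if err != nil {
		return err
	}
	c.Close()
	return nil
}

func main() {
	fmt.Println("DialUDP with an in-use laddr:", dialFromListeningPort())
}
```

With default options the second bind should normally be refused, which is why sending through the already-bound listening conn avoids the need for SO_REUSEPORT.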

Cristina, you were right about the address problem. I just made a mistake when collecting the output of the debug run.

Thank you all.

Gorkem


Re: Integration test
by Utku Görkem Ertürk - Saturday, 26 September 2020, 16:15
 

I fixed it. Thank you. (Hint: conn can write to another address)