Re: [Ips] [rddp] Storage Maintenance (storm) BOF reminder & requests
Hi Robert,

An RNIC would be happy to transition an established socket
into RDMA mode if the user wants to. As far as I know, there is
no problem transitioning from TOE mode to RDMA mode, at least.
It is a question of environmental support. To allow that,
the environment (aka OFED) would have to integrate late
socket translation, which allows RNIC drivers to officially do
a TOE to RDMA mode transition or to take over a socket
(where the OS would have to expose a well-defined TCP
context reallocation scheme).

Maybe it is not a good idea to change an end-to-end protocol
because a host environment has functional limitations
which may stem from historical transport (IB) limitations or
OS considerations.

many thanks,
bernard.

rddp-bounces at ietf.org wrote on 03/22/2009 04:41:55 PM:
> David:
>
> There are 2 issues I would like to suggest for discussion at the BOF
> meeting later this week. Both have to do with the iSER spec, RFC 5046.
>
> 1. At the present time, as far as I know, no existing hardware,
> neither Infiniband nor iWARP, is capable of opening a connection
> in "normal" TCP mode and then transitioning it into zero-copy mode.
> Unfortunately, the iSER spec requires that.
> Can't we just replace that part of the iSER spec?
> Otherwise, all hardware and all implementations are non-standard.
>
> 2. The OFED stack is used to access both Infiniband and iWARP hardware.
> This software requires 2 extra 64-bit fields for addressing
> on both Infiniband and iWARP hardware, but these fields
> are not allowed for in the current iSER Header Format.
> Can't we just add those extra fields to the iSER spec?
> If someday some other implementation doesn't need those
> fields, they can be just set to 0 (which is what is implied by
> the current iSER standard anyway). Again, by not doing this,
> all implementations are non-standard.
>
> In other words, I'm suggesting that we consider replacing the relevant
> parts of the current iSER specs with the current OFED specs on these
> 2 issues.
>
> Thanks for your consideration,
> Bob Russell
>
> Note: The following (old) posting by Mike Ko states that the
> extra header fields are needed only by IB, not by IETF
> (i.e., iWARP), because IB uses nZBVA, whereas iWARP uses ZBVA.
>
> But are there any IETF/iWARP implementations out there that actually
> use ZBVA with iWARP RNICs? (I don't mean software simulations of
> the iWARP protocol.) We have built an iSER implementation that
> uses the OFED stack to access both IB and iWARP hardware, and for
> both of them we need to use the extra iSER header fields (nZBVA).
> Perhaps this is an issue with the design of the OFED stack, which
> was built primarily to access IB hardware and therefore reflects
> the needs of the IB hardware. But we found that the only way to
> access iWARP hardware via the OFED stack was to use the expanded
> (nZBVA) iSER header (and to use a meaningful value in the extra field,
> NOT to just set it to zero).
>
> In any case, rather than have 2 different versions of the iSER header,
> it would be better to have just one, regardless of the underlying
> technology involved (after all, isn't that what a standard is for??).
> This is especially relevant when using the OFED stack, because,
> as we have demonstrated, software built on top of the OFED stack can
> (AND SHOULD!) be able to run with EITHER IB or iWARP hardware,
> with NO change to that software. Having 2 different iSER headers
> does NOT make that possible!
>
>
> > 2008/4/15 Mike Ko <mako at almaden.ibm.com>:
> >
> > VA is a concept introduced in an Infiniband annex to support iSER. It
> > appears in the expanded iSER header for Infiniband use only to support the
> > non-Zero Based Virtual Address (non-ZBVA) used in Infiniband vs the ZBVA
> > used in IETF.
>
> Mike - could you please put me in contact with someone who has actually
> implemented iSER on top of IETF/iWARP hardware NICs using ZBVA?
>
> >
> > "The DataDescriptorOut describes the I/O buffer starting with the immediate
> > unsolicited data (if any), followed by the non-immediate unsolicited data
> > (if any) and solicited data." If non-ZBVA mode is used, then VA points to
> > the beginning of this buffer. So in your example, the VA field in the
> > expanded iSER header will be zero. Note that for IETF, ZBVA is assumed and
> > there is no provision to specify a different VA in the iSER header.
>
> Mike - I believe this VA field in the expanded iSER header is almost
> NEVER zero -- it is always an actual virtual address.
>
> >
> > Tagged offset (TO) refers to the offset within a tagged buffer in RDMA Write
> > and RDMA Read Request Messages. When sending non-immediate unsolicited
> > data, Send Message types are used and the TO field is not present. Instead,
> > the buffer offset is appropriately represented by the Buffer Offset field in
> > the SCSI Data-Out PDU. Note that Tagged Offset is not the same as write VA
> > and it does not appear in the iSER header.
> >
> > Mike
>
> On Wed, 11 Mar 2009, Black_David at emc.com wrote:
>
> > This is a reminder that the Storage Maintenance BOF will
> > be held in about 2 weeks at the IETF meetings in San Francisco.
> > Please plan to attend if you're interested:
> >
> > THURSDAY, March 26, 2009
> > Continental 1&2 TSV storm Storage Maintenance BOF
> >
> > The BOF description is at:
> > http://www.ietf.org/mail-archive/web/ips/current/msg02669.html
> >
> > The initial agenda is here:
> > http://www.ietf.org/mail-archive/web/ips/current/msg02670.html
> >
> > I'm going to go upload that initial agenda as the BOF agenda,
> > and it can be bashed at the meeting.
> >
> > The primary purpose of this BOF is to answer two questions:
> > (1) What storage maintenance work (IP Storage, Remote Direct
> > Data Placement) should be done?
> > (2) Should an IETF Working Group be formed to undertake that
> > work?
> >
> > Everyone gets to weigh in on these decisions, even those who
> > can't attend the BOF meeting. Anyone who thinks that there is
> > work that should be done, and who cannot come to the BOF meeting
> > should say so on the IPS or RDDP mailing lists (and it'd be a
> > good idea for those who can come to do this). As part of the
> > email, please indicate how you're interested in helping (author
> > or co-author of specific drafts, promise to review and comment
> > on specific drafts).
> >
> > Here's a summary of the initial draft list of work items:
> > - iSCSI: Combine RFCs into one document, removing unused features.
> > - iSCSI: Interoperability report on what has been implemented and
> >   interoperates in support of Draft Standard status for iSCSI.
> > - iSCSI: Add backwards-compatible features to support SAM-4.
> > - iFCP: The Address Translation mode of iFCP needs to be deprecated.
> > - RDDP MPA: Small startup update for MPI application support.
> > - iSER: A few minor updates based on InfiniBand experience.
> >
> > Additional work (e.g., updated/improved iSNS for iSCSI, MIB changes,
> > updated ipsec security profile [i.e., IKEv2-based]) is possible if
> > there's interest.
> >
> > There are (at least) four possible outcomes:
> > (A) None of this work needs to be done.
> > (B) There are some small work items that make sense. Individual
> >     drafts with a draft shepherd (i.e., David Black) will
> >     suffice.
> > (C) A working group is needed to undertake more complex work
> >     items and reach consensus on design issues. The WG can
> >     be "virtual" and operate mostly via the mailing list
> >     until/unless controversial/contentious issues arise.
> > (D) There is a lot of complex work that is needed, and a WG
> >     that will plan to meet at every IETF meeting should be
> >     formed.
> >
> > Please note that the IETF "rough consensus" process requires a
> > working group in practice to be effective. This makes outcome
> > (C) look attractive to me, as:
> > - I'm coming under increasing pressure to limit travel, and
> >   the next two IETF meetings after San Francisco are not
> >   in the US.
> > - I'd rather have the "rough consensus" process available and
> >   not need it than need it and not have it available.
> >
> > Setting an example for how to express interest ...
> >
> > ---------------
> > I think that the iSCSI single RFC and interoperability report are
> > good ideas, but I want to see a bunch of people expressing interest
> > in these, as significant effort is involved. It might make sense
> > to do the single iSCSI RFC but put off the interoperability report
> > (the resulting RFC would remain at Proposed Standard rather than
> > going to Draft Standard), as I'm not hearing about major iSCSI
> > interoperability issues.
> >
> > I think the latter four items (SAM-4 for iSCSI, deprecate iFCP
> > address translation, MPI fix to MPA and iSER fixes) should all
> > be done.
> >
> > I plan to author the iFCP address translation deprecation draft,
> > and review all other drafts.
> >
> > I think that a virtual WG should be formed that plans to do its
> > work primarily via the mailing list. I believe the SAM-4 work
> > by itself is complex enough to need a working group - I would
> > expect design issues to turn up at least there and in determining
> > whether to remove certain iSCSI features, but I'm cautiously
> > optimistic that the mailing list is sufficient to work these
> > issues out (and concerned that travel restrictions are likely to
> > force use of the mailing list).
> >
> > -----------------
> >
> > Ok, who wants to go next?
> >
> > Thanks,
> > --David
> > ----------------------------------------------------
> > David L. Black, Distinguished Engineer
> > EMC Corporation, 176 South St., Hopkinton, MA 01748
> > +1 (508) 293-7953 FAX: +1 (508) 293-7786
> > black_david at emc.com Mobile: +1 (978) 394-7754
> > ----------------------------------------------------
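
For reference, the header difference raised in issue 2 can be sketched as
follows. This is only a sketch under stated assumptions: the thread says only
that the OFED-style header carries 2 extra 64-bit fields. The flag bit
positions, the field order (each VA placed right after its STag), and all
names below are assumptions for illustration, not text taken from RFC 5046 or
the Infiniband annex. The last helper illustrates the ZBVA vs. non-ZBVA point
from Mike Ko's posting: with ZBVA the offset used on the wire starts at zero,
while with non-ZBVA it starts at the advertised virtual address.

```python
import struct

# Assumed flag layout in the first byte: 4-bit opcode, then two "valid" bits.
OPCODE_CTRL = 0x1 << 4   # iSCSI control-type PDU (assumed encoding)
WSV = 0x08               # Write STag valid (assumed bit position)
RSV = 0x04               # Read STag valid (assumed bit position)


def _flags(write_stag, read_stag):
    """Build the flags byte from which STags are present."""
    flags = OPCODE_CTRL
    if write_stag is not None:
        flags |= WSV
    if read_stag is not None:
        flags |= RSV
    return flags


def pack_base_header(write_stag=None, read_stag=None):
    """Header without VA fields: flags, 3 reserved bytes, two 32-bit STags."""
    return struct.pack("!B3xII", _flags(write_stag, read_stag),
                       write_stag or 0, read_stag or 0)


def pack_expanded_header(write_stag=None, write_va=0,
                         read_stag=None, read_va=0):
    """Header with a 64-bit VA after each STag (field order is assumed)."""
    return struct.pack("!B3xIQIQ", _flags(write_stag, read_stag),
                       write_stag or 0, write_va,
                       read_stag or 0, read_va)


def wire_offset(buffer_offset, base_va=0):
    """Offset used for a tagged buffer: base_va is 0 under ZBVA and the
    advertised virtual address under non-ZBVA."""
    return base_va + buffer_offset


if __name__ == "__main__":
    base = pack_base_header(write_stag=0x1234)
    expanded = pack_expanded_header(write_stag=0x1234,
                                    write_va=0x7F0000001000)
    print(len(base), len(expanded))
    print(hex(wire_offset(512)), hex(wire_offset(512, 0x7F0000001000)))
```

The two extra 64-bit fields account for the 16-byte size difference between
the two packed headers, which is why a receiver expecting one layout cannot
parse the other without prior agreement.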