00:00
Thanks for coming out, everybody. Good morning. I appreciate everyone taking the time out. Hopefully you've been enjoying the event thus far. I'd like to welcome you to our chat, officially titled "Gen AI and Finance: Simplified, Scalable and Ready for Real Time." If anybody stopped by my colleague Rob,
00:20
who's in the back there, his flash talk yesterday, you probably heard a little bit about this. But this isn't simply a panel discussion here today, right? I've got some esteemed guests here with me, and over the past few months our collective firms have been working together. We've spent a tremendous amount of time and effort building a solution in WWT's Advanced
00:40
Technology Center. And I'm thrilled to announce that officially, as of today, it is GA and available for POC. We'd love for everybody to come stop by and engage us afterwards to get to know more about the solution itself later this afternoon.
00:55
My colleague Rob is also running an ancillary panel, I think at 2 o'clock, Rob? 1 o'clock. OK, 1 o'clock. That will dive deeper into the use cases that are in that lab. They're all geared towards real-world financial services use cases,
01:10
and we hope you find them interesting, right? So, myself, I'm not on the slide here; I'm Mike Russo. I run Pure's financial services division, or vertical, worldwide. One small change: Perbu from Nvidia unfortunately had something come up last minute.
01:26
So Perbu's not here today, but he wanted to be, and we will do our best as the remaining panelists to represent Nvidia to the best of our ability. But together we've created an OEM solution. It really brings together best-in-class ultra-low-latency software and hardware, scalable AI infrastructure, and real-time analytics capabilities aimed at solving real-world
01:48
challenges like trade ideation, portfolio optimization on the agentic side, research and analytics assistance, things of that nature. And who knows, it might actually answer questions faster than the average portfolio manager will. So let's dive in, right? We're going to explore the why and how,
02:06
as well as the value the solution brings, over the next couple of minutes here. What I'm going to do is hand it over to my esteemed panelists. Why don't you each, maybe Nataraj starting with you, go down the line with a brief introduction, and we'll kick it off. Sure, thank you. So I'm Nataraj.
02:25
I head up the AI solutions group at KX Systems. Basically, KX makes something called kdb+, which has a language called q. It's been around for almost 30 years. It's used specifically in high-frequency trading, in defense missile systems,
02:44
space, and other places where you really need ultra-low latency and extreme levels of database performance. I'm Philip, AI product manager from Pure. I've been working with this team over the last 6+ months. It's been a great experience, so yeah, thank you.
03:01
Hey everyone, I'm Ryan Avery with World Wide Technology. I'm a storage solution architect for high-performance storage, and I'm also responsible for the storage that we have in our AI Proving Ground in St. Louis. Good morning everyone. I'm Kevin Curtis, business development with Supermicro.
03:20
I work a lot on partnerships with Supermicro for our high-end compute. We're also a key partner with Nvidia, so most of the things we ship these days have Nvidia in them as well. That's part of the panel, so I'll be speaking a little bit for Nvidia.
03:38
Hi everyone, my name is Ed Chapman. I'm a VP of business development for Arista Networks. Hopefully some of you are familiar with us; we've got the number one data center networking market share in the world, and we're really happy to be participating in this partnership. We've done things before with Pure and Supermicro.
03:56
It's been successful in the past, and I think we've got a great solution for FSI workloads going forward. I'm John Owings. I've actually been with Pure in lots of different roles over the last 12 years, but I'm now the leader for our cloud native architecture team: field-facing people that work on our strategy when it comes to Portworx.
04:17
So, you know, like I tell everybody, it all runs on Kubernetes. We're the Kubernetes SMEs within Pure. Cool. Thanks guys. All right, so let's kick it off. Financial institutions,
04:32
as most of us are probably aware, are absolutely inundated with fragmented data, both structured and unstructured. As our panelists, I'd love your opinion on this: what makes our partnership uniquely capable of taming the chaos to support real-time decision making in things like trade ideation and other use cases?
04:50
Kevin, if you don't mind, I'm gonna point this one at you. Why don't you kick us off, and we'll go down the line. Sure. So, as everybody knows, time is money, and as I said, there's a proliferation of new data created every day at an exponential rate.
05:06
And it's not only the new data, but the updated information that comes in constantly. How do you make valuable use of that? All the data in the world is great, but if you can't use it, what value is there? So, enter the GenAI Pod that, you know,
05:27
the conglomeration of us have put together. It was designed specifically for financial services and optimized to maximize the use of data for real-time transactions. And part of that is trying to remove a lot of the complexities of it,
05:48
and there are a lot of challenges with ingesting data, processing it, storing it, retrieving it, and making use of it. So, from the Supermicro part, the compute part, we have partners Intel and Nvidia, and we have a very high-density CPU/GPU platform in a very small footprint
06:10
that can do a lot with that data, in conjunction with the other software and hardware here, to make use of that data and remove the complexities, so you have real-time information that you can act on. Yeah, no, that's great. I mean, I think, you know,
06:31
we as Pure, and Phil, keep me honest, JO as well: the way we look at it, when you think about fragmentation and data silos, we know they're out there, we know they exist, and we try to help firms break them down. I wouldn't say that we go and normalize data, but we can get it one step closer, to say semi-normalized, where you could then build your data lake and feed your models with it.
06:50
That's kind of our goal in this solution: making sure we break down those silos, making sure teams are able to share data not only with each other but across the organization as a whole, to really make the most of their AI when it comes to monetizing the data they already hold today. Yeah, I also want to add that when it comes to finance, the data is very multimodal.
07:10
You have, you know, a lot of graphs, images, text; it's not one mode. And if you analyze the data itself, the sizes are very different: one embedding might be 512 dimensions, another 1024. So Pure excels at multi-dimensional, multimodal performance. It's very different compared to the other storage vendors out there.
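To make the multimodal point concrete, here is a minimal sketch of what a per-modality vector search looks like: each modality gets its own embedding table, because the embedding dimensions (and models) differ, and retrieval is a nearest-neighbour lookup by cosine similarity. The table names, vectors, and tiny dimensions below are illustrative stand-ins, not actual KDB.AI internals:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy stand-ins: real text embeddings might be 512-d, image embeddings 1024-d.
text_index = {
    "q2_earnings_call": [0.9, 0.1, 0.0, 0.2],
    "sec_10k_filing":   [0.1, 0.8, 0.3, 0.0],
}
chart_index = {
    "price_chart_aapl": [0.7, 0.2, 0.1, 0.0, 0.4, 0.1],
    "vol_surface_spx":  [0.0, 0.9, 0.2, 0.3, 0.1, 0.0],
}

def search(index, query_vec, k=1):
    # Brute-force nearest neighbour by cosine similarity.
    ranked = sorted(index.items(), key=lambda kv: cosine(kv[1], query_vec),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

print(search(text_index, [1.0, 0.0, 0.1, 0.2]))
print(search(chart_index, [0.6, 0.1, 0.0, 0.0, 0.5, 0.2]))
```

In production the tables hold huge numbers of 512- or 1024-dimensional vectors and the linear scan is replaced by an approximate index, but the shape of the operation, and why per-modality performance matters, stays the same.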
07:29
It's not raw performance; it's multi-dimensional performance. So what does that mean to the research analyst, right? You put in a query and, on time to first token, you get an immediate response, because the multi-dimensional performance is much faster on, you know, the PDFs and data. The second thing is
07:47
the hockey-stick curve on the amount of data produced by these financial institutions; it's growing, but the data center size is fixed. Your enterprise customers want to fill it with GPUs, and that leaves very little space for storage, so we have to cram in as much performance as we can in
08:06
that small space. So performance density, based on our DFMs, is something we have innovated on over the last 10 years, and this really plays into our strengths here. Yeah, I think basically what we're saying is that in order to get that leverage, that advantage, like an arbitrage advantage,
08:28
especially in the financial markets, you need a differentiated proposition, a solution stack, right? You could be on AWS, you could be on Azure, and you're using a certain stack that everyone else is using. What the team has done here is put together all the best in breed: you have Pure Storage, you have Arista for the
08:46
networking, you have Supermicro for compute, you have Nvidia creating all these great Blackwell machines, and then you have kdb+ and KX doing the real-time data processing. So that's the kind of stack that hedge funds are going to look for, in the places where you're trying to get the alpha;
09:05
that's where you need that kind of a stack. Yeah, absolutely. Any other commentary before I segue on? Well, the one thing I'm excited about, in regards to the fact that we're all working together creating a solution: a lot of times when I've talked to customers who are deploying,
09:24
especially when it's not the large cloud guys who are building out these massive GPU clusters, it really is organizations that may not have that core expertise. But here they have the opportunity that we can provide them with a solution, and we've worked with WWT on other opportunities in the FSI market, as Arista, in doing testing.
09:42
And getting that experience and testing, allowing these financial services organizations to test the applications within that lab, gives them a much better understanding of what's going to come out of it. I think that's a really good thing for organizations that are trying to figure out what they're going to be running, and why they're running applications for AI, and testing that in that environment.
10:05
So I'm excited about that, and the fact that we're all together providing, I guess, a curated solution out of the marketplace is great. Yeah, likewise. I mean, that's definitely a huge part of the excitement. Obviously we all partner closely with World Wide Technology, but having a place where we can actually tell clients and prospects, hey, go to the lab,
10:23
put your hands on it, see what it's capable of, have that experience. It just removes some of the hurdles and challenges firms may be facing when they think about buy versus build, and about the tech stacks they may want to deploy versus what they have today. It's just a great place to welcome people to.
10:39
So I appreciate everybody's partnership here. Thank you all. Let's go on to question two. I think Nataraj highlighted it: as many of you are probably already aware, KX Systems' kdb+ is really well known throughout the industry for ultrafast time
10:55
series analytics, but now we've introduced retrieval-augmented generation, right? That adds a whole new dimension. I'm gonna start with Nataraj, and then we'll move down to Ryan for some of the results we're seeing. But Nataraj, can you explain how KDB.AI,
11:10
integrated with the Nvidia AI Enterprise software, facilitates use cases like the AI research assistant and agentic AI, which both rely on structured and unstructured data at scale at, probably, almost zero latency? So an easy layup question for you. Zero would be very, you know, controversial; it's more like 5 milliseconds.
11:35
And we just lost, like, a billion dollars. So, well, you know, kdb+ has been mainly for structured data, right, for all these years. It was written by someone named Arthur Whitney and came from Ken Iverson's language APL, so it goes back to the 70s, you know.
11:53
And, you know, once the whole GenAI thing started, there was a question: should we also have a vector database? Because vectors were already supported; it's a language used by quants, all the quants, so numerical arrays are already supported. So it was really a matter of creating that sort of product or value proposition where it could also have a vector DB.
12:15
It's not just any vector database; there are lots of optimizations, like GPU optimizations and other low-level optimizations. And, you know, I don't know how many people have heard of KDB.AI, because you've probably heard of other competing products in the market, and part of the reason is that, again, these things live in a certain niche of the world. But it's not like you cannot use it as a
12:37
vector database for any other use case, right? So what that did, essentially, is that with the advent of GenAI and so on, we no longer need to limit ourselves to just the quantitative numerical modeling; we can supplement that with information from social media feeds, or information from SEC filings, and correlate the two. So you can ask
13:01
something like: what happened with Apple during COVID, and when should I have, you know, placed my best bet? And it can look into the structured data, correlate that with the filings and the news information, and kind of give you an answer. So there are quite a few products. There's a product for a research assistant, where you ask questions and it does that correlation.
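The correlation pattern described here can be sketched in a few lines of plain Python: filter the structured price series to an event window, find the interesting point, then retrieve dated unstructured snippets near that point to ground the answer. Every date, price, and snippet below is made up purely for illustration; in the real stack the structured side would be a kdb+ table and the final step an LLM call:

```python
from datetime import date

# Toy structured store: a handful of daily closes (illustrative numbers only).
prices = {
    date(2020, 2, 21): 313.05,
    date(2020, 3, 23): 224.37,
    date(2020, 4, 17): 282.80,
}

# Toy unstructured store: dated snippets standing in for filing/news chunks.
documents = [
    (date(2020, 3, 23), "Fed announces open-ended asset purchases."),
    (date(2020, 7, 30), "Apple 10-Q: June-quarter revenue up year over year."),
]

def answer(window_start, window_end):
    # 1) Structured side: find the worst close inside the event window.
    in_window = {d: p for d, p in prices.items() if window_start <= d <= window_end}
    low_day = min(in_window, key=in_window.get)
    # 2) Unstructured side: retrieve documents within 7 days of that date.
    context = [txt for d, txt in documents if abs((d - low_day).days) <= 7]
    # 3) In the real system both pieces would be handed to the LLM as
    #    grounded context; here we just format them as a string.
    return f"Low close {in_window[low_day]} on {low_day}; context: {context}"

print(answer(date(2020, 2, 1), date(2020, 9, 1)))
```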
13:21
There's something for portfolio managers: what's my optimal portfolio? There's something for alpha/beta extraction: where's the market alpha? Beta is the market; alpha is where you get your edge. And as a hedge fund, you're not looking for, like, a 6 or 10% advantage over the 50% mark,
13:43
right? You might just be looking at a 50.27% accuracy, and that 0.27% edge can translate into billions of dollars. And plus, you know, there's deep learning, forecasting, and other stuff in there too. It's just that all our products are named AI-powered, because now, unless you put "AI-powered" on it,
14:03
you know, people don't quite realize it's AI-powered. But yeah, it's a pretty nice suite of applications; it's worth trying out. Yeah, understood. Also, congrats: I saw the announcement the other day about the approved blueprint with Nvidia around the
14:18
agentic side, and I was reading through that and shared it with a few people. I love the fact that multiple agents can be running multiple queries at once, taking the structured and the unstructured side, mashing the questions together, and coming out with a really well-rounded answer. And that, in my mind, is just, you know, the technological advancement:
14:36
AI itself, and then retrieval-augmented generation going into agentic. It's just game changing when you think about the amount of time that can be saved and the amount of inference that can be gained over a short span of time. It's going to be really interesting, I think, to see, not only as firms start to use that on their own, but for those that visit the lab, what they'll be able to see and what they'll be
14:56
capable of going forward. I guess as a follow-up, I'm gonna aim this one at Ryan. So, not everybody in the room may be fully aware of what the World Wide Technology Advanced Technology Center is, so would you mind giving everybody just a quick verbal breakdown: what is it,
15:12
what are we doing in there, and also what are some of the results we've seen to date in the lab? Sure. So, excuse me, in St. Louis we've built out, I think we're up to close to $2 billion worth of investment in different technologies, far beyond storage: networking, compute, the application level. And we've built out portals where customers can access the stuff we
15:36
build in there, so that we can do POCs on demand and our own internal testing. And really, our sellers are very well trained to sell through our ATC and our proving grounds, so that the capabilities we have in there can help reinforce the messaging that we're doing in the field when we're talking about AI or anything else. So,
15:56
again, just a massive amount of talent and investment that we've put into that over the years. And about, I guess, 2 years ago, we decided that we needed something more focused specifically on AI, so we built out what we call the AI Proving Ground. My portion of that is specifically around storage for AI and high-performance computing use cases. So what we did here was build out this entire stack in our lab,
16:20
from every single component of it, so that we could do our own testing as well as POCs with customers. And as far as what we've seen so far: just an incredible ease of setup compared to some of the other products that we've tested.
16:37
Retrieval-augmented generation is widely popular right now; everybody's talking about it, and there are a few different ways to do it. There are some solutions that have RAG integrated into physical hardware; others are very much a build-your-own based on a certain architecture.
16:54
So what they've done, with the ease of deployment and the architecture that's built around it, makes it so much faster to deploy than some of the other stuff we've seen. That's a huge bonus, especially for people trying to get up and running quickly who don't wanna science-project this thing for months. And then
17:14
the performance of it has also been incredibly impressive, again compared to what we've seen with some of the more integrated solutions; the performance numbers around this solution seem to be leading the pack from what we've seen. No, thank you for that. OK, so let's flip to a different topic, right?
17:34
Let's talk about scalability for a moment, because I think when we start thinking about AI, the next biggest thing is: OK, how big can it get? How fast can it go? But also: how do we house it, and what scale can we run at? So, Kubernetes wasn't originally designed with AI in mind,
17:49
and yet here we are, right? So, JO, I'm gonna aim this one at you. How is the Pure platform helping enterprises extend Kubernetes to meet the unique demands of AI training and inference at scale? And, when paired with our partners, how does the solution architecture support both the hedge fund experimenting with things like GenAI all the way up to a global bank deploying
18:13
across hundreds of users and potentially hundreds of use cases? Yeah. So, I mean, Kubernetes wasn't designed with storage in mind either, right? So there was a gap there that we had to fill several years ago, and obviously, with our acquisition of Portworx, we were able to build a platform that can control that data. And it's gonna be
18:35
everywhere: there's gonna be data that's unstructured, there's gonna be data in databases, there's gonna be data in other places. Building that intelligence into Kubernetes is what we did, and in that way we can scale. Because there are ways you can, you know, go all the way down to "I know how to mount an NFS mount and just do it manually,"
18:55
right? But that doesn't scale; that's not going to drive you beyond one big file share. In these types of environments there are lots of different types of data, whether it's caches, databases, you know, all the things that have to go into that. And so, in this solution,
19:15
right, it's now because Kubernetes is kind of that central point. It's the scheduler; it decides what goes where, and how to get access to the GPUs, those types of things. On the other side, Portworx will then deliver where the data is, right? It builds that intelligence and tells Kubernetes: hey, the data for that app, whatever it might be, is on the FlashBlade, or it's on this other one.
19:36
whatever it might be is on the flash plate or it's on this, this. Right, so all those things, all that connectivity, it needs to be able to be abstracted in a way where the people that actually consume these solutions, especially at the financials, they're not Kubernettis, they're not HPC teams.
19:56
These are, uh, PhDs in math, and the more time they're spending tweaking infrastructure, the less time they're doing what they're supposed to be doing. And so that's why we wanna abstract that and give them a template that says, hey, do this and you can start working on what's important to you and to the business,
20:17
right? And so, the session I'm doing later today is about speed to value, time to value, when it comes to AI. Because if you're messing with lots of bespoke settings, and not doing something like this AIPod, you're looking at, like: oh, we bought a bunch of stuff, and we
20:38
took a year to make it work, and now we can start doing things, right? If we could shrink that time... that's what we see, at least when we talk to the financials; that's what they're looking for. We can't spend 2 years figuring out how to make all this work before the PhDs get a hold of it.
20:56
The shorter that lead time is, the faster we can start differentiating in how we're delivering. Yeah, for sure. I mean, in the conversations Rob and I have gotten into, at any financial firm looking at AI, microservice-based deployment is definitely high on the list. And then when you think about some of the
21:13
things we've probably heard in the keynotes, like Fusion and EXA and things of that nature, when you think about the scale it's actually possible to house on Pure and the simplicity we can bring, and then pair all this with the other great vendors and partners we have in this stack, it really makes for a game-changing difference. Yeah.
21:31
And it will scale, right? The big thing is that you could start now, start getting value, go "OK, this is working," and add to it. That's the point. Nobody wants to buy a SuperPOD the first day and then decide,
21:47
oh, we just wasted $120 million. Although if you want to buy one, I'm sure we can help you with that. I don't think anyone on the panel would complain, but I think there could be a couple of C-level execs looking for jobs after that one. OK, so some financial firms have already made large AI investments,
22:09
right? So for the general question, I'll start with Nataraj, but Philip as well, and Ryan, chime in. Why would they even care about the stack, right? Obviously we all put our heads together and went: this is going to be great, everyone's gonna love it,
22:21
right? But in your opinion, why would they care about it? And, as a follow-on, what's the business case for integrating the solution now versus waiting? I know there's money in the financial sector; we all see it. With hedge funds it's not so much;
22:35
I don't think they do wait-and-see. But on the global banking side, it always feels like someone's always the first in the pool with the toe dip, and then everybody goes to jump in the deep end, but it takes a while for them to get there. So, just, what's the opinion on why they'd care about it, and why now? Yeah, yeah.
22:53
So, well, firstly, you know, this is an architecture that's sort of on the very high end of performance, right? It's not like you have to use this whole architecture; there are elements of it that you can plug and play with. But I think more generally, I used to do POCs with vendors, and it takes months, maybe years, to find the right
23:14
solution; just for storage we have done so many different POCs. What you have here is sort of a POC proven into a solution that has already gone through that rigor: the rigor of the performance, on multiple levels of the entire computing ecosystem. And it gives you an architecture, sort of a blueprint, to start out with.
23:37
And I didn't mention too much about the Nvidia stuff, but there are many components within Nvidia, like cuDF or cuVS, that are almost drop-in replacements for your Python code, whatever your data scientists are using. You just put that code in the stack and everything speeds up like 10X, 100X. So that's one. The second thing is that a lot of the data, the market data that you have,
23:58
we already have preloaded, or have loaders for. So it's not like loading your market data into any vector database, where you might need to start from first principles; here you already have a blueprint that you can follow and say, OK, I don't need to spend days or weeks trying to figure out the optimal combination,
24:18
but I can just load it up very quickly. Cool. I'd like to add a few more points, right? So we like to advertise that the time to deploy the stack is 90% better than the competition, and there are a few reasons why. The Pure engineers spent a lot of time working with Ed's
24:37
network engineers to right-size the architecture. Getting the architectural sizing correct: there's an art and a science to it. So we have small, medium, and large sizes for different sizes of models and different numbers of users.
24:52
We also worked with Supermicro and Nvidia on the sizing between DGX and HGX, right? What to choose, and how do you educate the customer? With the enormous amount of knowledge across these three companies, we've done a lot of pre-work for the end customers, so they don't need to go do all that work themselves.
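For a feel of the back-of-the-envelope arithmetic behind those small/medium/large tiers, here is a common rule of thumb: weights at fp16 take two bytes per parameter, plus headroom for KV cache, activations, and runtime overhead, ceiling-divided across per-GPU memory. These are generic heuristics, not the partners' actual sizing methodology:

```python
def gpu_memory_estimate_gb(params_billion, bytes_per_param=2, overhead=1.2):
    # Weights at fp16 (2 bytes/param) plus ~20% headroom for KV cache,
    # activations, and CUDA context. A rule of thumb, not a guarantee.
    return params_billion * bytes_per_param * overhead

def gpus_needed(params_billion, gpu_mem_gb=80):
    # Ceiling-divide the estimate across (e.g.) 80 GB devices.
    need = gpu_memory_estimate_gb(params_billion)
    return -(-need // gpu_mem_gb)

# A 70B-parameter model at fp16: 70 * 2 * 1.2 = 168 GB, so three 80 GB GPUs.
print(gpus_needed(70))
```

Concurrency pushes this up further: each simultaneous user adds KV-cache memory roughly proportional to context length, which is why user count shows up in the sizing alongside model size.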
25:13
Second is on the software side, working with KX and WWT, making sure the install scripts and the custom scripts are absolutely clean, right? Recently we were working with a very big bank, and the deployment of Nvidia AI Enterprise was the biggest bottleneck there. So getting that hashed out, thanks to WWT,
25:34
who have a lot of experience there, contributed to the seamless, turnkey, validated stack. And this is one of the very few vertically focused stacks; there are a lot of horizontal stacks in the market, like similar AIPods, but it's probably the only one out there built for finance. So yeah. Yeah, no,
25:54
that's great. And Ryan, just looking to you: obviously, World Wide Technology, you guys are a trusted advisor to many in the financial sector. A lot of people turn to you, whether they want to try something new or they're debating what they should be building as the best stack. I think the question to you is,
26:10
what are you guys seeing right now with the inbound traffic, as far as what people are looking for, and kind of the why now, right? Timing-wise, why do you think now is the right time for this? Yeah, so we do have a lot of customers in the financial space that come to us and kind of want to know what everyone else is
26:27
doing. Nobody wants to be the last one; timing is critical, I think, in that market. We touched on some of this a little bit here, but their IT staffs are strained. They don't have time to pick up something new and complicated that's gonna take months and months to roll out, so the faster they can
26:44
deploy these solutions, I think, helps their case immensely, to be able to get value out of that investment quickly. And then the added bonus that we've got in this environment is that Pure already exists in most of these companies, so these IT staffs are already familiar with Pure and what they bring to the market, and that just makes that step that much easier for them to
27:07
roll into their daily practice, and again, get that value out of it quickly. And what I'll add to some of that, that I'm excited about, is customers can get to see the outcome first. They can work with you guys in the proving ground to kind of see, oh, this does work, and that speeds it up, right?
27:25
There's not gonna be a 9-month POC of, well, does it really work, and then we all ship gear out there. They can come to you, they can see some outcomes right away, and then we can go from there. Yeah, the fact that we've got that built in our lab today, and that we can quickly allow customers access to it to see what this
27:43
looks like. We don't bring their data in, but we're able to simulate data, or show them sample data, or whatever. That step just cuts a whole bunch of time out of the process and allows them to understand better what we're offering. And one more point on that: you
28:00
already have the investment in AI. This is kind of an enhancement to your existing investment, to get a faster ROI on your money. The good thing is, as everybody said, you don't have to figure it out; we've already done it for you. It's ready to go, easy to deploy, and it's time to money,
28:20
kind of off the shelf. Awesome, thanks guys. Conscious of time, one more I want to get out there before we're out of here. So, Kevin, and probably Ryan and Ed also: security and data sovereignty are paramount in financial services. I'm sure the audience is all too well aware of that, right? So how does our joint solution ensure safe,
28:43
high-performance AI without compromising data privacy or introducing additional risk to what already exists today? Well, first and foremost, it's on site. So you manage your own data, you manage your own platform. You don't have to worry about
29:03
sending it out to the cloud or to some third party. You have oversight into everything, because it's your equipment, it's on your site, it's under your control. So it's your responsibility to manage the security for that site.
29:27
Yeah, I think the other thing is, because we've tested all of it, the configurations are all gonna work. Each one of the partners here has very capable security features and functionality embedded into the system. For instance, from an Arista perspective, we've had a single operating
29:45
system for years. A lot of our financial services customers have been using us, deployed across their environments, and they recognize what we've done in terms of limiting the number of critical vulnerabilities and exposures compared to our competitors: we've had 25 over the past 5 years,
30:01
some of our competitors about 500. So we intrinsically have this knowledge about how to harden these environments and make sure they're secured. And because this configuration has been curated, you're not going to have misconfigurations potentially leaving open vulnerabilities around passwords and all the rest of that stuff.
30:22
I think that adds additional layers of security and reduces the chance of anyone getting access into the stack. Yeah, and maybe I'll just add, from more of the AI perspective: there's this field of responsible AI and ethical AI. Your answers are always going to have citations.
30:42
They're going to have references back to the actual documents, and then in terms of safety, there are guardrails and other measures. Guardrails are components within the NVIDIA stack that you can put in place to ensure your model returns results grounded in real facts. More generally, I'll say it because we're talking about this stack:
31:05
yes, it's meant for finance, but it's just that in finance you get a lot of these pioneering, differentiating technologies first; that's not to say you cannot use it in other domains. Lockheed Martin uses it. The US Army was the first buyer of kdb+, so it was
31:25
private for the longest time, non-commercial: molecular modeling, you name it. It's just that in finance we have this very high velocity, milliseconds, microseconds, nanoseconds, so you push the boundaries to the point where you end up with a stack like this. Yeah, that's excellent. I think we probably have
31:44
time to sneak one more in, and it segues perfectly. Looking ahead over the next 12 to 18 months, what new financial services use cases, or others, do you think this solution could potentially unlock that aren't really being fully tapped yet? Maybe you start and the rest can fill
32:02
in or opine after, and we'll see where we get on time. Sure. So you're going to see more and more personalization: based on, let's say, the client's attributes, your portfolios being personalized automatically, so more of the bots managing
32:18
your portfolios. You're going to see more multimodal: can you read the charts and make a trading decision, smart order routing based on charts and so on. You're also going to see more agentic, but the thing is,
32:32
everyone's saying "agentic" so much it's almost become a cliché. But it's basically the concept that you're going to synthesize information from multiple different sources and try to come up with a sort of source of truth. So yeah, I think there are a lot of possibilities, and
32:50
at this point I just don't say anything is not possible, because I remember 10 or 15 years ago I was at a conference where they were asking, will AI become AGI? I said no, it's not going to happen, and here we are. Yeah, a few points on that: I think the rate at which reasoning models are getting developed is really scary,
33:10
right? It's almost at, or close to, human IQ. We had models which are very creative, which can summarize and so on, but over the last 6 months, and probably the next 12 to 18 months, I think reasoning models will get very much better, and intelligence will also get
33:29
cheaper. The cost per token, for OpenAI and Anthropic for example, is coming down exponentially, so intelligence is getting better and cheaper. So things like algorithmic trading, where you really require intelligence to do first-principles thinking, you'll probably have agents doing,
33:46
right? I think those will really improve the top lines of banks. That's my theory. Anybody else? I do have to say that I agree agentic has become the new cloud-washing word. It's almost like a drinking game: if it pops up,
34:09
you'd be drunk by the third slide. Exactly. No, I fully agree. It's definitely something I just keep hearing more and more about, but good that we can address it, so no harm, no foul there. Conscious of time, I do want to open up to the audience.
34:24
Anything, questions, comments? Don't be shy. So with regards to data intelligence, what are you doing to make the data that you ingest more intelligent? You mentioned multimodal earlier; obviously nirvana is being able to have real-time indexes on that multimodal data.
34:46
Are you seeing any growth in that area? Yeah, so kdb+ was primarily being used for streaming analytics. You'd have tens of millions of prices coming at you, and you'd need to compute moving averages, VWAPs, and things like that in a very low-latency way.
35:04
So we are seeing the quant desks, based on the time horizon of course, some quant desks operate at milliseconds or microseconds, some operate more on the order of an hour or two because they're looking at broader market shifts, but we are seeing quant desks actually using that real-time
35:23
price movement data as well as the news that's coming out. There's a whole modeling process that goes into it, you have to do benchmark testing and all that, but to answer your question: yes, that has already started. We work with every large major bank and their front desks, and that's already in the works.
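The streaming pattern described above, rolling calculations over a high-rate price feed, can be sketched in plain Python. This is an illustrative toy, not kdb+/q: the class name, the window size, and the sample ticks are all invented for the example.

```python
from collections import deque

class RollingStats:
    """Rolling moving average and VWAP over a tick stream (toy example)."""

    def __init__(self, window: int):
        self.window = window
        self.ticks = deque()   # (price, size) pairs currently in the window
        self.price_sum = 0.0
        self.notional = 0.0    # running sum of price * size
        self.volume = 0.0      # running sum of size

    def update(self, price: float, size: float) -> tuple[float, float]:
        """Ingest one tick; return (moving average, VWAP) for the window."""
        self.ticks.append((price, size))
        self.price_sum += price
        self.notional += price * size
        self.volume += size
        if len(self.ticks) > self.window:       # evict the oldest tick
            old_price, old_size = self.ticks.popleft()
            self.price_sum -= old_price
            self.notional -= old_price * old_size
            self.volume -= old_size
        return self.price_sum / len(self.ticks), self.notional / self.volume

# Feed a few ticks through a 3-tick window.
stats = RollingStats(window=3)
for price, size in [(100.0, 10), (101.0, 20), (102.0, 30), (103.0, 40)]:
    ma, vwap = stats.update(price, size)
# After the last tick the window holds the final three ticks:
# ma == 102.0, vwap == 9200/90 ≈ 102.22
```

Running sums keep each update O(1), which is the same idea a streaming analytics engine applies at millions of ticks per second.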
35:42
Yeah, so real-time RAG is becoming mainstream, and there are libraries to enable that, working with the GPU and KX and offloading computations, so the latencies are very small. It's like instant insights: a news article comes in and you
36:02
can talk to it right away. Yeah, and we did some benchmarks, if I can just add to this, in terms of RAG and the retrieval of documents. There are competing products in the market, and we know what time they take. So say you have a query and a million indexed documents, and you're trying
36:19
to find which document chunks answer your question. You have to do a pretty huge matrix multiplication, a matmul of 1,536 values against 1 million times 1,536 values. We have benchmarked it, I don't want to name names, but we benchmarked it on the common vector databases.
36:40
Then we put it into kdb+, into KDB.AI. The CPU version was, I wouldn't say comparable, but about what a very well-architected general vector DB would do. Then we switched to GPU mode, with the optimized search, and the time to find the similar documents out of 1 million embedded documents was 7 microseconds.
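The retrieval arithmetic described here boils down to a single matrix-vector product. A minimal NumPy sketch (not the KDB.AI implementation; the corpus is scaled down to 10,000 documents and filled with random unit vectors for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, dim = 10_000, 1536                 # the talk's example used 1,000,000 docs

# Unit-normalized document embeddings, so a dot product is cosine similarity.
docs = rng.standard_normal((n_docs, dim)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

query = rng.standard_normal(dim).astype(np.float32)
query /= np.linalg.norm(query)

# One (n_docs x dim) @ (dim,) product scores every document at once.
scores = docs @ query

# Indices of the 5 best-matching document chunks, best first.
top_k = np.argsort(scores)[::-1][:5]
```

On a GPU the same product runs as one fused kernel over the full corpus, which is where microsecond-scale lookups come from.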
37:02
I was stunned: at 7 microseconds on a single machine, you could be running the entire bank, if you will. That's the kind of performance you end up with. Awesome, that's impressive. Questions? I saw one more hand. Can you describe the smallest building block of your solution?
37:24
What are we visualizing, what is it, one Pure array? What is the smallest building block of yours, because ultimately, after the testing... Yeah, so I can start and maybe others can go. It's a two-node HGX system that we have as the small configuration, 8-way H100s, and
37:56
we are starting with the FlashBlade//S200; we can scale up to the //S500 and probably the bigger ones that we have. Do you want to comment on the network and the Tomahawk? Yeah, it's either a 32-port or 64-port switch depending on the configuration, and it's going to be two switches, because you have the redundancy.
38:18
Footprint-wise, probably about a half rack, maybe. Are you talking about the smallest size? Yeah, and the software takes up no space as well. Exactly. He got the easy one on that one. Now, I think we've designed the solution to be a little bit flexible, in the sense that it's
38:35
not like we all threw the Ferrari at it, this monstrous system that costs a ton of money off the bat. It is highly configurable. You can pick and choose the different sizes of the Pure storage, the network kit, the Supermicro side,
38:48
the NVIDIA licensing, etc. We've tried to build something with a bit of flexibility, something for everybody, but it'll range, because a quant shop may look at one thing and a global bank may look at another. We want to make sure we've tuned it enough that you can scale big, but if you want to start small,
39:06
you absolutely can. There's no forced minimum footprint. Yeah, and sometimes the TCO part can be a bit deceptive. It might seem small, let's say $10, right? But if you wanted to build the same architecture yourself... I've seen examples where it would have taken us
39:24
a year with the Databricks team to do the same stuff that we did with a 3- or 4-person unit on a single AWS server. It just works fascinatingly, incredibly well. So there's that element that sometimes gets overlooked, because we think, OK, it's very niche or something.
39:45
Absolutely. OK, well, I think we're just at time. Before we close out, I do want to point everybody to 1 o'clock this afternoon: my colleague Rob and Natara are going to run a separate session that deep dives into the actual use cases built in the testing center. So if you have further curiosity about what's actually in there,
40:01
or if you have interest in testing and what you might be able to test, I definitely recommend checking that out. The only other call to action I have: if you're interested, come approach any of us, whether it's in here or out on the floor.
40:14
We're happy to talk about it at length, and you have all of our contact information here; I'm sure some pictures have been taken. Feel free to reach out to us after the event as well if you'd like to continue the conversation. Other than that, thank you all for your time. We really appreciate it. Hope you enjoyed the talk.